
ML & DL Notes

The document provides an overview of machine learning (ML) and its various types, including supervised, unsupervised, and reinforcement learning, along with their respective algorithms and applications. It explains key concepts such as regression, classification, decision trees, and random forests, as well as performance metrics like accuracy, precision, recall, and F1 score. Additionally, it discusses the importance of confusion matrices and ROC curves in evaluating model performance.

Uploaded by

devesharma120
Copyright
© All Rights Reserved

ML & DL Notes
What is machine learning?

• Machine learning (ML) is a subset of artificial intelligence (AI) that focuses on
learning from data to build models that can make predictions.

• Machine learning takes a data-driven approach: a model is typically trained on
historical data and then used to make predictions on new data.

• ML can find patterns and insights in large datasets that might be difficult for
humans to discover.
Types of Machine Learning.

• Supervised learning is a type of machine learning in which the algorithm is
trained on a labeled dataset. The algorithm is given input features together with
their corresponding output labels, learns a mapping from features to targets, and
generalizes from this data to make predictions on new, unseen data.

• There are two main types of supervised learning: regression and classification.
Regression

• Regression is a type of supervised learning where the algorithm learns to predict
continuous values based on input features. The output labels in regression are
continuous values, such as stock prices or housing prices. Regression algorithms
in machine learning include Linear Regression, Polynomial Regression, Decision
Tree Regression, and Random Forest Regression.


Classification

• Classification is a type of supervised learning where the algorithm learns to
assign input data to a specific category or class based on input features. The
output labels in classification are discrete values. Classification can be
binary, where the output is one of two possible classes, or multiclass, where the
output can be one of several classes. Classification algorithms in machine
learning include Logistic Regression, Decision Tree, Support Vector Machine
(SVM), and K-Nearest Neighbors (KNN).
Types of Machine Learning.

• Unsupervised learning is a type of machine learning where the algorithm learns
to recognize patterns in data without being trained on labeled examples. The goal
of unsupervised learning is to discover the underlying structure or distribution
in the data.

• There are two main types of unsupervised learning: clustering and dimensionality
reduction.


Clustering and Dimensionality reduction

• Clustering algorithms group similar data points together based on their characteristics.
The goal is to identify groups, or clusters, of data points that are similar to each other
while being distinct from other groups. Some popular clustering algorithms include
K-means, Hierarchical clustering, and DBSCAN.

• Dimensionality reduction algorithms reduce the number of input variables in a dataset
while preserving as much of the original information as possible. This is useful for
reducing the complexity of a dataset and making it easier to visualize and analyze. Some
popular dimensionality reduction algorithms include Principal Component Analysis (PCA),
t-SNE, and Autoencoders.
Types of Machine Learning.

• Reinforcement learning is a type of machine learning where an agent learns to
interact with an environment by performing actions and receiving rewards or
penalties based on those actions. The goal of reinforcement learning is to learn a
policy, which is a mapping from states to actions, that maximizes the expected
cumulative reward over time.

• There are two main types of reinforcement learning: model-based reinforcement
learning and model-free reinforcement learning.


Model-based and Model-free reinforcement learning

• In model-based reinforcement learning, the agent learns a model of the environment,
including the transition probabilities between states and the rewards associated with each
state-action pair. The agent then uses this model to plan its actions in order to maximize
its expected reward. Planning methods commonly used with such a model include Value
Iteration and Policy Iteration.

• In model-free reinforcement learning, the agent learns a policy or value function directly
from experience without explicitly building a model of the environment. The agent interacts
with the environment and updates its estimates based on the rewards it receives. Some
popular model-free reinforcement learning algorithms include Q-Learning, SARSA, and many
deep reinforcement learning methods.
Regression Algorithms

Linear regression is one of the simplest and most widely used statistical
models. It assumes that there is a linear relationship between the
independent and dependent variables, meaning that a change in the
dependent variable is proportional to the change in the independent variables.
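
As an illustration of the linear assumption, here is a minimal sketch that fits a straight line to one independent variable using ordinary least squares. The function name `fit_line` and the sample data are assumptions made for this example, not part of the notes.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one variable: y ≈ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data generated from y = 2x + 1 (an assumption for the demo).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Because the data lie exactly on a line, the fitted slope and intercept recover the values used to generate them; with noisy data they would only approximate the trend.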

Polynomial regression is used to model nonlinear relationships between the
dependent variable and the independent variables. It adds polynomial terms to
the linear regression model to capture more complex relationships.
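
The idea of "adding polynomial terms" can be sketched as follows: expand each input x into powers of x, then fit those expanded features with ordinary least squares. This is a simplified sketch that solves the normal equations directly; the helper names (`polynomial_features`, `solve_linear_system`, `fit_polynomial`) and the sample data are assumptions for illustration, and a numerical library would normally be used instead.

```python
def polynomial_features(x, degree):
    """Expand a single input x into [1, x, x^2, ..., x^degree]."""
    return [x ** p for p in range(degree + 1)]


def solve_linear_system(a, b):
    """Solve a @ w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_polynomial(xs, ys, degree):
    """Least-squares fit via the normal equations (X^T X) w = X^T y."""
    rows = [polynomial_features(x, degree) for x in xs]
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Illustrative curved data generated from y = x^2 + 1.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 + 1 for x in xs]
# coeffs = [intercept, coefficient of x, coefficient of x^2]
coeffs = fit_polynomial(xs, ys, degree=2)
print(coeffs)
```

The model is still linear in its coefficients; only the features are nonlinear in x, which is why the same least-squares machinery applies.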
Regression Algorithm

• Logistic regression is used for binary classification. It applies the sigmoid
function to a linear combination of the independent variables and produces a
probability value between 0 and 1.

• For example, with two classes, Class 0 and Class 1, an input belongs to Class 1
if the value of the logistic function for that input is greater than 0.5 (the
threshold value); otherwise it belongs to Class 0. It is called regression because
it extends linear regression, but it is mainly used for classification problems.
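
The sigmoid-and-threshold step described above can be sketched as follows. The weights and bias here are hypothetical values chosen for the demo, not learned parameters, and the function names are assumptions.

```python
import math


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of Class 1: sigmoid of a linear combination of the inputs."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_class(features, weights, bias, threshold=0.5):
    """Class 1 if the probability exceeds the threshold, otherwise Class 0."""
    return 1 if predict_proba(features, weights, bias) > threshold else 0


# Hypothetical parameters (not learned from data).
weights, bias = [1.5, -0.5], -1.0

print(predict_class([2.0, 1.0], weights, bias))  # z = 1.5, probability > 0.5
print(predict_class([0.0, 1.0], weights, bias))  # z = -1.5, probability < 0.5
```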
Regression Metrics

• Mean Squared Error (MSE): It measures the average squared difference
between the predicted and the actual target values within a dataset. It gives a
sense of how far off the predictions are from the actual values, with a larger
penalty for larger errors.
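
A minimal sketch of the MSE calculation, using made-up actual and predicted values:

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values: the errors are 1, 0, and -3, so the squares are 1, 0, and 9.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 10.0]
mse = mean_squared_error(actual, predicted)
print(mse)
```

The single error of 3 contributes 9 of the 10 total squared units, which shows how squaring penalizes large errors more heavily than small ones.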


Regression Metrics

• Mean Absolute Deviation (MAD) of a data set is the average distance
between each data point and the mean of the data, i.e. it represents the amount
of variation around the mean value in the data set. It is a measure of spread,
calculated as the average of the absolute differences between each value of the
data set and the mean. Its regression counterpart, mean absolute error (MAE),
applies the same averaging to the absolute differences between predicted and
actual values.
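
A minimal sketch of the mean absolute deviation calculation, with made-up data:

```python
def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean of the data."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


# Illustrative data: the mean is 5, and the distances from it are 3, 1, 1, 3.
data = [2.0, 4.0, 6.0, 8.0]
mad = mean_absolute_deviation(data)
print(mad)  # 2.0
```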
Decision Tree

• A decision tree is a supervised learning algorithm used to model and predict
outcomes based on input data. It is a tree-like structure where:

• Each internal node represents a test on an attribute.

• Each branch corresponds to an outcome of that test (an attribute value or range of values).

• Each leaf node represents the final decision or prediction.

• Decision trees can be used for both classification and regression problems.
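
The structure described above (internal nodes as tests, branches as outcomes, leaves as predictions) can be sketched with a small hand-built tree. The attributes `age` and `income`, the thresholds, and the class labels are hypothetical; a real tree would learn its splits from training data.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    prediction: str  # final decision stored at a leaf node


@dataclass
class Node:
    feature: str                  # attribute tested at this internal node
    threshold: float              # test is: sample[feature] <= threshold
    left: Union["Node", Leaf]     # branch taken when the test is true
    right: Union["Node", Leaf]    # branch taken when the test is false


def predict(tree, sample):
    """Follow branches from the root until a leaf is reached."""
    while isinstance(tree, Node):
        if sample[tree.feature] <= tree.threshold:
            tree = tree.left
        else:
            tree = tree.right
    return tree.prediction


# Hand-built example tree (hypothetical attributes and thresholds).
tree = Node(
    feature="age",
    threshold=30,
    left=Leaf("no"),
    right=Node(feature="income", threshold=50000, left=Leaf("no"), right=Leaf("yes")),
)

print(predict(tree, {"age": 25, "income": 80000}))  # no
print(predict(tree, {"age": 40, "income": 80000}))  # yes
```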
Decision Tree Terminologies

Root Node: The top node of the tree, representing the initial decision or feature
from which the tree starts branching out.

Internal Nodes (Decision Nodes): Nodes that make decisions based on the
values of specific attributes. These nodes have branches that lead to other nodes.

Leaf Nodes (Terminal Nodes): The end points of the branches, where final
decisions or predictions are made. Leaf nodes do not have further branches.
Decision Tree Terminologies

Branches (Edges): The connections between nodes that represent the decision
path taken based on certain conditions.

Splitting: The process of dividing a node into two or more sub-nodes based on a
decision rule, like selecting a feature and a threshold to create subsets of data.

Parent Node: A node that splits into child nodes. It is the original node from
which a split starts.
Decision Tree Terminologies

Child Node: Nodes that result from a split of a parent node.

Decision Criterion: The rule or condition used to split data at a decision node.
This involves comparing feature values against a threshold.

Pruning: The process of removing branches or nodes from a decision tree to
improve its generalization and prevent overfitting, which means making the model
simpler and more effective at predicting new data.
Random Forest

The Random Forest algorithm is a powerful machine learning technique that
improves prediction accuracy and reduces overfitting by using multiple decision
trees. Here's how it works:

Creating Multiple Trees: During the training phase, the algorithm creates many
decision trees.

1. Each tree is built using a random subset of the training data.

2. Each split in the trees uses a random subset of the features.


Random Forest

Introducing Randomness: This randomness in data and features ensures that
each tree is different, which helps prevent overfitting and makes the model more
robust.

Making Predictions:
For Classification: The algorithm takes a vote from all the trees. The class that
gets the most votes is the final prediction.
For Regression: The algorithm averages the predictions from all the trees to get
the final result.
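
The sampling and aggregation steps can be sketched without building full trees. This sketch assumes the random data subset is a bootstrap sample (drawn with replacement, the same size as the original), which is the usual choice but is not stated in these notes; the function names and sample predictions are also assumptions.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_sample(rows, rng):
    """Random subset of the training data, drawn with replacement (assumption)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(features, k, rng):
    """Random subset of k features to consider at a split."""
    return rng.sample(features, k)


def majority_vote(predictions):
    """Classification: the class predicted by the most trees wins.
    On a tie, the class that appears first in the list is returned."""
    return Counter(predictions).most_common(1)[0][0]


def average_prediction(predictions):
    """Regression: average the predictions from all trees."""
    return mean(predictions)


rng = random.Random(0)  # fixed seed so the demo is repeatable
rows = list(range(10))
sample = bootstrap_sample(rows, rng)
subset = random_feature_subset(["f1", "f2", "f3", "f4"], 2, rng)

# Hypothetical outputs from three trees.
print(majority_vote(["A", "B", "A"]))       # A
print(average_prediction([2.0, 4.0, 6.0]))  # 4.0
```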
Random Forest

Key Benefits of Random Forest

• Handles Complex Data: Works well with large datasets and many features.

• Reduces Overfitting: Less likely to overfit compared to a single decision tree
because it averages multiple trees.

• Reliable Predictions: Tends to give consistent and accurate predictions across
different datasets.
Advantages of Random Forest

Reduced Overfitting: Random Forests average multiple trees, which reduces overfitting
compared with a single Decision Tree.

Higher Accuracy: Combining many trees usually improves prediction accuracy.

Robust to Noise: Random Forests handle noisy data better by averaging results from
various trees.

Handles High Dimensions: Works well with many features because each split considers
only a random subset of them.

Feature Importance: Provides feature importance estimates by averaging across trees.

Stability: More stable, with consistent predictions despite slight data changes.
Classification Accuracy

• Classification accuracy is what we usually mean when we use the term accuracy.
It is the ratio of correct predictions to the total number of input samples.

Accuracy = No. of correct predictions / Total number of input samples

• Accuracy works well when each class has a similar number of samples, but it can
mislead when the classes are imbalanced. For example, suppose the training set is
90% class A and 10% class B.
Classification Accuracy

• A model that predicts class A for every sample then scores 90% accuracy on the
training set. If we test the same model on a test set that is 60% class A and
40% class B, its accuracy falls to 60%.

• Accuracy is a useful metric, but on imbalanced data it can give a false sense of
high performance, because the model may misclassify most minority-class samples
while still scoring well.
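
The 90% versus 60% example can be reproduced with a model that always predicts class A. The set sizes of 100 samples are an assumption chosen to match the percentages in the example.

```python
def accuracy(actual, predicted):
    """Number of correct predictions divided by the total number of samples."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def always_predict_a(samples):
    """A trivial model that ignores its input and predicts class A."""
    return ["A"] * len(samples)


train_labels = ["A"] * 90 + ["B"] * 10  # 90% class A, 10% class B
test_labels = ["A"] * 60 + ["B"] * 40   # 60% class A, 40% class B

train_accuracy = accuracy(train_labels, always_predict_a(train_labels))
test_accuracy = accuracy(test_labels, always_predict_a(test_labels))
print(train_accuracy, test_accuracy)
```

The model never identifies a single class B sample, yet its training accuracy looks high purely because class A dominates the training set.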
Confusion Matrix

• A confusion matrix is a matrix that summarizes the performance of a machine
learning model on a set of test data.

• It provides a summary of the prediction results on a classification problem by
comparing actual and predicted values.

• For binary classification, the matrix is a table with the four possible
combinations of predicted and actual values.
Confusion Matrix

Each cell of the matrix counts the test instances for one combination of actual and predicted class.

• True positives (TP): the model predicts positive and the data point is actually positive.

• True negatives (TN): the model predicts negative and the data point is actually negative.

• False positives (FP): the model predicts positive but the data point is actually negative.

• False negatives (FN): the model predicts negative but the data point is actually positive.

• Accuracy is calculated from the matrix by summing the values on the main diagonal and dividing by the total number of samples, i.e.

Accuracy = (TP + TN) / Total samples = (TP + TN) / (TP + TN + FP + FN)
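
A minimal sketch of counting the four cells for a binary problem and deriving accuracy from them. The labels (1 for positive, 0 for negative) and the sample predictions are made up for illustration.

```python
def confusion_counts(actual, predicted):
    """Return (TP, TN, FP, FN) for binary labels where 1 is the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


# Illustrative labels and predictions.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
matrix_accuracy = (tp + tn) / (tp + tn + fp + fn)
print(tp, tn, fp, fn, matrix_accuracy)  # 3 3 1 1 0.75
```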


Precision

• Precision tells you how many of the positive predictions made by the model are
actually correct. It is calculated as the number of true positive predictions
divided by the total number of positive predictions (true positives plus false
positives).

Precision = TP / (TP + FP)
Recall

• Recall represents how well a model can identify actual positive cases. It measures
the ability of the model to find all the positive instances. It answers the question:
"Of all the actual positive instances, how many did the model correctly identify?"

Formula:

Recall = True Positives (TP) / (True Positives (TP) + False Negatives (FN))
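
Precision and recall can both be computed from confusion matrix counts, as in this short sketch. The counts are made up, and returning 0.0 when a denominator is zero is an assumption chosen for this example rather than a universal convention.

```python
def precision(tp, fp):
    """Share of positive predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0  # 0.0 fallback is an assumption


def recall(tp, fn):
    """Share of actual positives that were found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0  # 0.0 fallback is an assumption


# Illustrative counts: 8 true positives, 2 false positives, 4 false negatives.
p = precision(tp=8, fp=2)
r = recall(tp=8, fn=4)
print(p, r)
```

Here 8 of the 10 positive predictions are correct (precision 0.8), but only 8 of the 12 actual positives were found (recall of about 0.67), so the two metrics capture different kinds of error.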


F1 - Score

• The F1 score is calculated as the harmonic mean of precision and recall:
F1 = 2 × Precision × Recall / (Precision + Recall).

• The harmonic mean is the type of mean used to average rates. It is calculated
by dividing the number of terms by the sum of the reciprocals of the given values.

• The regular (arithmetic) mean treats all values equally, whereas the harmonic
mean gives much more weight to low values, so the F1 score is high only when both
precision and recall are high.
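
The difference between the two means is easiest to see with a lopsided model. This sketch uses made-up precision and recall values and assumes both are greater than zero.

```python
def harmonic_mean(values):
    """Number of terms divided by the sum of reciprocals (values must be > 0)."""
    return len(values) / sum(1.0 / v for v in values)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return harmonic_mean([precision, recall])


# Illustrative values: very high precision, very low recall.
precision_value, recall_value = 0.9, 0.1

arithmetic = (precision_value + recall_value) / 2
f1 = f1_score(precision_value, recall_value)
print(arithmetic, f1)
```

The arithmetic mean comes out at 0.5, which hides the poor recall, while the F1 score of about 0.18 stays close to the weaker of the two values.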
ROC Curve & AUC

• ROC stands for Receiver Operating Characteristic, and the ROC curve is a
graphical representation of the effectiveness of a binary classification model. It
plots the true positive rate (TPR) against the false positive rate (FPR) at different
classification thresholds. AUC is the area under this curve.

The curve is plotted between two parameters, which are:

• True Positive Rate or TPR = TP / (TP + FN)
• False Positive Rate or FPR = FP / (FP + TN)
In the curve, TPR is plotted on the Y-axis, whereas FPR is on the X-axis.
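
Each point on the ROC curve comes from one classification threshold. The sketch below computes a few such points from made-up labels and model scores. It assumes a score at or above the threshold counts as a positive prediction and that both classes are present in the data.

```python
def roc_point(actual, scores, threshold):
    """Return (FPR, TPR) when scores >= threshold are predicted positive."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    tpr = tp / (tp + fn)  # true positive rate, plotted on the Y-axis
    fpr = fp / (fp + tn)  # false positive rate, plotted on the X-axis
    return fpr, tpr


# Illustrative labels (1 = positive) and model scores.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

# One (FPR, TPR) point per threshold, from strictest to most lenient.
points = [roc_point(actual, scores, t) for t in (1.1, 0.5, 0.0)]
print(points)
```

A threshold above every score predicts nothing as positive and gives the point (0, 0); a threshold of 0 predicts everything as positive and gives (1, 1). Intermediate thresholds trace the curve between those corners.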
