AI Algorithm Implementation
Table of Contents
1. What is Regression?
2. What is Linear Regression (LR)?
3. Basic assumption of Linear Regression (LR)
4. Implementation of Linear Regression
Linear Regression with Python Implementation
1. What is Regression?
Regression analysis is a statistical method that
helps us understand the relationship between a
dependent variable and one or more independent variables.
Dependent Variable
This is the main factor that we are trying to predict.
Independent Variable
These are the variables that have a relationship
with the dependent variable.
2. What is Linear Regression?
In machine learning lingo, Linear Regression (LR)
simply means finding the best-fitting line that explains
the variability between the dependent and independent
features; in other words, it describes the linear
relationship between the independent and dependent
features. In linear regression, the algorithm
predicts continuous targets (e.g. salary,
price) rather than categorical ones
(e.g. cat, dog).
Types of Regression Analysis
There are many types of regression analysis, but in this
article we will deal with:
1. Simple Linear Regression
Simple Linear Regression
Simple Linear Regression uses the slope-intercept
(weight-bias) form, where our model needs to find
the optimal values for both the slope and the intercept.
With these optimal values, the model can capture the
variability between the independent and dependent
features and produce accurate results.
In simple linear regression, the model takes a single
independent variable and a single dependent variable.
There are many ways to write the equation of a straight
line; we will stick with the common slope-intercept form,
y = mx + c (in machine-learning notation, ŷ = wx + b,
where w is the weight/slope and b is the bias/intercept).
Loss Function
It is a calculation of the loss for a single training example.
Cost Function
It is a calculation of the average loss over the entire training dataset.
Steps
1. Our model will fit all possible lines and find the overall average
error between the actual and predicted values for each line.
2. It selects the line which has the lowest overall error; that
will be the best-fit line.
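To make this concrete, here is a minimal sketch of simple linear regression in Python. Instead of literally trying all possible lines, it uses the closed-form least-squares solution for the slope and intercept and then reports the mean squared error (the cost); the toy experience/salary arrays are made-up values for illustration, not data from this article.

```python
import numpy as np

# Toy data (illustrative only): years of experience vs. salary
X = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([30, 35, 45, 48, 60], dtype=float)

# Closed-form least-squares estimates of slope (w) and intercept (b)
w = np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)
b = y.mean() - w * X.mean()

# Cost: mean squared error between actual and predicted values
y_pred = w * X + b
mse = np.mean((y - y_pred) ** 2)

print(f"slope (w) = {w:.3f}, intercept (b) = {b:.3f}, MSE = {mse:.3f}")
```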
An Introduction to K-Means Clustering
6. Understanding the Different Evaluation Metrics for Clustering
Inertia
Recall the first property of clusters we covered above:
• It tells us how far apart the points within a cluster are.
So, inertia actually calculates the sum of distances of all the
points within a cluster from the centroid of that cluster.
Normally, we use Euclidean distance as the distance metric when
most of the features are numeric; otherwise, we use Manhattan
distance when most of the features are categorical.
We calculate this for all the clusters; the final inertia value is
the sum of all these distances. The distance within a cluster
is known as the intracluster distance, so inertia gives us the sum of
intracluster distances.
Keeping this in mind, we can say that the lesser the inertia value,
the better our clusters are.
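As a quick illustration, here is a minimal sketch of computing inertia by hand for a given clustering. Note that scikit-learn's KMeans reports this same quantity (using squared Euclidean distances) as its `inertia_` attribute; the points and cluster labels below are made-up values for illustration.

```python
import numpy as np

# Made-up 2-D points and their assigned cluster labels (illustrative only)
points = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 8.5]])
labels = np.array([0, 0, 1, 1])

inertia = 0.0
for k in np.unique(labels):
    cluster = points[labels == k]
    centroid = cluster.mean(axis=0)
    # Sum of squared Euclidean distances of the cluster's points to its centroid
    inertia += np.sum((cluster - centroid) ** 2)

print(f"inertia = {inertia:.3f}")
```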
Dunn Index
Inertia makes sure that the first property of
clusters is satisfied, but it does not care about
the second property – that different clusters
should be as different from each other as possible.
This is where the Dunn index comes into action.
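The Dunn index is usually defined as the minimum inter-cluster distance divided by the maximum intra-cluster distance, so larger values are better. The sketch below is one common variant, using centroid-to-centroid distances between clusters and each cluster's diameter within clusters; the data and function name are assumptions for illustration.

```python
import numpy as np
from itertools import combinations

def dunn_index(points: np.ndarray, labels: np.ndarray) -> float:
    """Min inter-cluster (centroid) distance / max intra-cluster diameter."""
    clusters = [points[labels == k] for k in np.unique(labels)]
    centroids = [c.mean(axis=0) for c in clusters]

    # Minimum distance between any two cluster centroids (inter-cluster distance)
    min_inter = min(np.linalg.norm(a - b) for a, b in combinations(centroids, 2))

    # Maximum pairwise distance within any single cluster (intra-cluster diameter)
    max_intra = max(
        max(np.linalg.norm(p - q) for p, q in combinations(c, 2))
        for c in clusters if len(c) > 1
    )
    return min_inter / max_intra

# Same made-up points as in the inertia sketch (illustrative only)
points = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 8.5]])
labels = np.array([0, 0, 1, 1])
print(f"Dunn index = {dunn_index(points, labels):.3f}")
```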
The step of computing the centroids and assigning all the points
to a cluster based on their distance from the centroids is a
single iteration. But wait – when should we stop this process? It
can’t run till eternity, right? In practice, we stop when the cluster
assignments (or the centroids) no longer change between iterations,
or when a maximum number of iterations is reached.
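Here is a minimal sketch of that iterate-until-convergence loop, stopping when the centroids barely move or a maximum number of iterations is reached; the initialisation strategy and the `tol` and `max_iter` values are illustrative assumptions rather than anything prescribed by this article.

```python
import numpy as np

def kmeans(points, n_clusters, max_iter=100, tol=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    # Initialise centroids by picking random data points (illustrative choice)
    centroids = points[rng.choice(len(points), n_clusters, replace=False)]
    for _ in range(max_iter):
        # Assign each point to its nearest centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points
        new_centroids = np.array(
            [points[labels == k].mean(axis=0) for k in range(n_clusters)]
        )
        # Stopping criterion: centroids barely move between iterations
        if np.linalg.norm(new_centroids - centroids) < tol:
            break
        centroids = new_centroids
    return labels, centroids

points = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 8.5], [1.2, 1.9]])
labels, centroids = kmeans(points, n_clusters=2)
print(labels, centroids)
```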
Decision Tree
3. Decision Tree Terminologies
Before learning more about decision trees, let's get familiar with some of the
terminology:
Root Node:
The initial node at the beginning of a decision tree, where the entire
population or dataset starts dividing based on various features or conditions.
Decision Nodes:
Nodes resulting from the splitting of root nodes are known as decision nodes.
These nodes represent intermediate decisions or conditions within the tree.
Leaf Nodes:
Nodes where further splitting is not possible, often indicating the final
classification or outcome. Leaf nodes are also referred to as terminal nodes.
Sub-Tree:
Similar to how a subsection of a graph is called a sub-graph, a sub-section of a
decision tree is called a sub-tree.
Pruning:
The process of removing or cutting down specific nodes in a decision
tree to prevent overfitting and simplify the model.
Branch / Sub-Tree:
A subsection of the entire decision tree is referred to as a branch or
sub-tree. It represents a specific path of decisions and outcomes within
the tree.
Decision Tree
5. Decision Tree Assumptions
Several assumptions are made when building a decision tree:
Binary Splits
Decision trees typically make binary splits, meaning each node
divides the data into two subsets based on a single feature or
condition. This assumes that each decision can be represented as a
binary choice.
Recursive Partitioning
Decision trees use a recursive partitioning process, where each node
is divided into child nodes, and this process continues until a stopping
criterion is met. This assumes that data can be effectively subdivided
into smaller, more manageable subsets.
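As an illustration of binary splits and recursive partitioning, here is a small sketch using scikit-learn's `DecisionTreeClassifier` on the Iris dataset; the dataset choice and hyperparameters (`max_depth=3`, Gini criterion) are assumptions for the example, with `max_depth` acting as the stopping criterion mentioned above.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Load a small benchmark dataset (illustrative choice)
X, y = load_iris(return_X_y=True)

# Each internal node makes one binary split on a single feature;
# max_depth stops the recursive partitioning from growing forever
clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
clf.fit(X, y)

# Print the tree as nested if/else rules to see the binary splits
print(export_text(clf, feature_names=load_iris().feature_names))
```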
Feature Independence
Decision trees often assume that the features used for splitting nodes
are independent. In practice, feature independence may not hold, but
decision trees can still perform well if features are correlated.
Homogeneity
Decision trees aim to create homogeneous subgroups in each node,
meaning that the samples within a node are as similar as possible
regarding the target variable. This assumption helps in achieving
clear decision boundaries.
Overfitting
Decision trees are prone to overfitting when they capture noise in the
data. Pruning and setting appropriate stopping criteria are used to
address this issue.
Impurity Measures
Decision trees use impurity measures such as Gini impurity or entropy
to evaluate how well a split separates classes. The choice of impurity
measure can impact tree construction.
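For reference, here is a minimal sketch of the two impurity measures mentioned above, computed from class proportions; the example label array is made up for illustration.

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 - sum(p_i^2) over the class proportions p_i."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    """Entropy: -sum(p_i * log2(p_i)) over the class proportions p_i."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

node = np.array(["cat", "cat", "dog", "dog", "dog"])  # made-up node labels
print(f"Gini = {gini(node):.3f}, entropy = {entropy(node):.3f} bits")
```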
No Missing Values
Decision trees assume that there are no missing values in the dataset
or that missing values have been appropriately handled through
imputation or other methods.
No Outliers
Decision trees are sensitive to outliers, and extreme values can
influence their construction. Preprocessing or robust methods may be
needed to handle outliers effectively.
Decision Tree
6. Entropy
Suppose you have a group of friends deciding which movie to
watch together on Sunday. There are 2 choices of movies, one is
“Lucy” and the second is “Titanic”, and now everyone has to state
their choice.
After everyone gives their answer, we see that “Lucy” gets 4 votes
and “Titanic” gets 5 votes. Which movie do we watch now? Isn't it
hard to choose one movie, because the votes for both movies are
roughly equal?
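To quantify how mixed this vote is, we can plug the 4-vs-5 split into the standard entropy formula; this is a hedged worked example added for illustration, not a calculation from the original text.

```python
import math

votes = {"Lucy": 4, "Titanic": 5}  # made-up scenario from the example above
total = sum(votes.values())

# Entropy = -sum(p_i * log2(p_i)); close to 1 bit means a near 50/50 split
entropy = -sum((v / total) * math.log2(v / total) for v in votes.values())
print(f"entropy = {entropy:.3f} bits")  # ~0.991, i.e. the group is very undecided
```

An entropy close to 1 bit means the group is almost evenly split, which is exactly why the decision feels hard; a unanimous vote would have entropy 0.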