
Linear Regression with Python

Implementation
Table of Contents

1. What is Regression?
2. What is Linear Regression (LR)?
3. Basic assumptions of Linear Regression (LR)
4. Implementation of Linear Regression
Linear Regression with Python
Implementation
1. What is Regression?
Regression analysis is a statistical method that helps us understand the relationship between a dependent variable and one or more independent variables.
Dependent Variable
This is the Main Factor that we are trying to predict.

Independent Variable
These are the variables that have a relationship
with the dependent variable.
Linear Regression with Python
Implementation
2. What is Linear Regression?
In machine learning lingo, Linear Regression (LR) simply means finding the best-fitting line that explains the variability between the dependent and independent features; in other words, it describes the linear relationship between the independent and dependent features. In linear regression, the algorithm predicts continuous features (e.g. salary, price) rather than categorical features (e.g. cat, dog).
Types of Regression Analysis
There are many types of regression analysis, but in this article we will deal with:
1. Simple Linear Regression
Linear Regression with Python Implementation
Simple Linear Regression
Simple Linear Regression uses the slope-intercept
(weight-bias) form, where our model needs to find
the optimal value for both slope and intercept. So
with the optimal values, the model can find the
variability between the independent and
dependent features and produce accurate results.
In simple linear regression, the model takes a
single independent and dependent variable.
There are many equations that represent a straight line; we will stick with the common slope-intercept form:

y = b1·x + b0   (i.e. y = m·x + c)

Here, y and x are the dependent and independent variables respectively. b1 (m) and b0 (c) are the slope and the y-intercept respectively.
Slope (m) tells us, for one unit of increase in x, how many units y increases. When the line is steep, the slope will be higher; the slope will be lower for a less steep line.
Constant (c) is the value of y when x is zero.
Linear Regression with Python Implementation
How the Model will Select the Best Fit Line?

First, our model will try a bunch of different straight lines, and from those it finds the optimal line that predicts our data points well.
From the nearby picture, you can notice there are 4 lines; can you guess which one will be our best fit line?
For finding the best fit line, our model uses a cost function.
(Image source: https://corporatefinanceinstitute.com/multiple-linear-regression)
Linear Regression with Python
Implementation
In machine learning, every algorithm has a cost function, and in simple linear regression the goal of our algorithm is to find a minimal value for the cost function. In linear regression (LR) we have many cost functions, but the most commonly used cost function is MSE (Mean Squared Error), also known as the L2 loss:

MSE = (1/n) · Σ (yi − ŷi)²

where yi is the actual value, ŷi is the predicted value, and n is the number of records.
(yi − ŷi) is the loss function for a single record. Most of the time people use the words loss and cost function interchangeably, but they are different; we square the terms to neglect the negative values.
Linear Regression with Python
Implementation
How the Model will Select the Best Fit Line?

Loss Function
It is a calculation of the loss for a single training example.

Cost Function
It is a calculation of the average loss over the entire training data.

Steps
Our model will fit all possible lines and find an overall average error between the actual and predicted values for each line respectively.
It selects the line which has the lowest overall error, and that will be the best fit line.
Linear Regression with Python
Implementation
From the nearby picture, the blue data points represent the actual values from the training data, and the red line (vector) is the predicted value for each actual blue data point. We can notice a random error (actual value minus predicted value); the model is trying to minimize the error between the actual and predicted values, because in the real world we need a model that makes predictions very well. So our model will find the loss between all the actual and predicted values respectively, and it selects the line which has the lowest average error over all points.
Linear Regression with Python Implementation
Multiple Linear Regression
In multiple linear regression, our model will apply the same steps. Instead of having a single independent variable, the model has multiple independent variables to predict the dependent variable:

y = b0 + b1·x1 + b2·x2 + … + bn·xn

where b0 is the y-intercept, b1, b2, b3, b4, …, bn are the slopes of the independent variables x1, x2, x3, x4, …, xn, and y is the dependent variable. Here, instead of finding a line, our model will find the best-fitting plane (hyperplane) in higher dimensions.

3. Assumption of Linear Regression
Linearity
The first and most obvious assumption of our model is linearity.
• It means there must be a linear relationship between the dependent and independent features.
• Without a linear relationship, accurate predictions won’t be possible. The most commonly used methods to find a linear relationship are correlation and scatterplots.
• A correlation provides information on the strength and direction of the linear relationship between two variables.
Linear Regression with Python Implementation
3. Assumption of Linear Regression

Linearity
The first and most obvious assumption of our model is linearity. It means there must be a linear relationship between the dependent and independent features. Without a linear relationship, accurate predictions won’t be possible. The most commonly used methods to find the linear relationship are correlation and scatterplots. A correlation provides information on the strength and direction of the linear relationship between two variables.

Normality
Normality doesn’t mean our independent variables should be normally distributed; linear regression can work perfectly well with non-normal distributions. Normality means our errors (residuals) should be normally distributed. We can get the errors of the model in statsmodels using the code below, and we can use a histogram and a statsmodels Q-Q plot to check their probability distribution.

Multicollinearity
The independent variables should not be correlated with each other. When they are correlated with each other, we could conclude that one variable explains another variable well, so we don’t need two variables doing the same thing. E.g. you have two pens, and both pens do the same thing, which is writing, so you don’t need two pens for writing (consider them magical pens, so you won’t run out of ink). Before dropping one of the two variables, you must also see how strongly the two independent variables are correlated.
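The code the slide refers to is not included in this extract; a minimal sketch of what such a residual-normality check could look like with statsmodels (the data below is randomly generated for illustration) is:

import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Illustrative data: one independent variable x and a dependent variable y.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=100)

# Fit an ordinary least squares model (add_constant adds the intercept term).
model = sm.OLS(y, sm.add_constant(x)).fit()

# The errors (residuals) of the model.
residuals = model.resid

# Check the normality assumption with a histogram and a Q-Q plot.
plt.hist(residuals, bins=20)
sm.qqplot(residuals, line="45")
plt.show()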
4. Implementation of Linear
Regression
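The implementation code itself is not included in these slides; below is a minimal sketch using scikit-learn's LinearRegression. The experience/salary numbers are made-up placeholders, not data from the slides.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Illustrative data: years of experience (independent) vs. salary in $1000s (dependent).
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
y = np.array([30, 35, 42, 48, 55, 60, 66, 72])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Fit the simple linear regression model y = b1*x + b0.
model = LinearRegression().fit(X_train, y_train)
print("slope (b1):", model.coef_[0])
print("intercept (b0):", model.intercept_)

# Evaluate with the MSE cost discussed earlier.
y_pred = model.predict(X_test)
print("MSE:", mean_squared_error(y_test, y_pred))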
An Introduction to K-Means Clustering
K-means clustering is a method that comes from signal processing. It aims to group a set of n observations into k clusters: each observation is placed in the cluster whose mean (cluster center) is closest to it, making that mean the cluster’s representative.

k-means clustering is one of the most popular ways to group data.


Learning Objectives

-Get introduced to K-Means Clustering.


-Understand the properties of clusters and the various
evaluation metrics for clustering.
-Get acquainted with some of the many real-world
applications of K-Means Clustering.
-Implement K-Means Clustering in Python on a real-world
dataset.
An Introduction to K-Means Clustering
1. What is K-Means Clustering?
2. How K-Means Clustering Works?
3. Objective of K-Means Clustering
4. What is Clustering?
5. Properties of K-Means Clustering
6. Understanding the Different Evaluation Metrics for Clustering
7. How to Apply the K-Means Clustering Algorithm?
8. Implementing K-Means Clustering in Python From Scratch
9. Challenges With the K-Means Clustering Algorithm
10. K-Means++ to Choose Initial Cluster Centroids for K-Means Clustering
An Introduction to K-Means Clustering
1. What is K-Means Clustering?

• K-means clustering is a popular unsupervised machine learning algorithm used for partitioning a dataset into a pre-defined number of clusters. The goal is to group similar data points together and discover underlying patterns or structures within the data.

• Recall the first property of clusters – it states that the points within a cluster should be similar to each other. So, our aim here is to minimize the distance between the points within a cluster.

• The main objective of the K-Means algorithm is to minimize the sum of distances between the points and their respective cluster centroids.
An Introduction to K-Means Clustering
2.How K-Means Clustering Works?
Here’s how it works:
Initialization: Start by randomly selecting K points from the dataset.
These points will act as the initial cluster centroids.
Assignment: For each data point in the dataset, calculate the
distance between that point and each of the K centroids. Assign the
data point to the cluster whose centroid is closest to it. This step
effectively forms K clusters.
Update centroids: Once all data points have been assigned to
clusters, recalculate the centroids of the clusters by taking the
mean of all data points assigned to each cluster.
Repeat: Repeat steps 2 and 3 until convergence. Convergence
occurs when the centroids no longer change significantly or when a
specified number of iterations is reached.
Final Result: Once convergence is achieved, the algorithm outputs the final cluster centroids and the assignment of each data point to its cluster.
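As a hedged illustration (not part of the slides), scikit-learn's KMeans runs exactly this initialize/assign/update loop; the toy points below are placeholders.

import numpy as np
from sklearn.cluster import KMeans

# Illustrative 2-D points (e.g., scaled income vs. debt).
X = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0],
              [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]])

# n_clusters is K; n_init is the number of random initializations tried;
# max_iter bounds the assign/update iterations (a stopping criterion).
kmeans = KMeans(n_clusters=2, n_init=10, max_iter=300, random_state=42).fit(X)

print("cluster assignments:", kmeans.labels_)
print("final centroids:\n", kmeans.cluster_centers_)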
An Introduction to K-Means Clustering
3.Objective of k means Clustering
The main objective of k-means clustering is to partition your data
into a specific number (k) of groups, where data points within each
group are similar and dissimilar to points in other groups. It
achieves this by minimizing the distance between data points and
their assigned cluster’s center, called the centroid.
Here are its main objectives:
Grouping similar data points: K-means aims to identify patterns in
your data by grouping data points that share similar
characteristics together. This allows you to discover underlying
structures within the data.
Minimizing within-cluster distance: The algorithm strives to make
sure data points within a cluster are as close as possible to each
other, as measured by a distance metric (usually Euclidean
distance). This ensures tight-knit clusters with high cohesiveness.
Maximizing between-cluster distance: Conversely, k-means also tries to maximize the separation between clusters; ideally, data points in different clusters should be as different from each other as possible.
An Introduction to K-Means Clustering
4. What is Clustering?
Cluster analysis is a technique in data mining and machine learning
that groups similar objects into clusters. K-means clustering, a
popular method, aims to divide a set of objects into K clusters,
minimizing the sum of squared distances between the objects and
their respective cluster centers.
Let’s try understanding this with a simple example. A bank wants to give
credit card offers to its customers. Currently, they look at the details of each
customer and, based on this information, decide which offer should be given
to which customer.
Now, the bank can potentially have millions of customers. Does it make sense
to look at the details of each customer separately and then make a decision?
Certainly not! It is a manual process and will take a huge amount of time.
An Introduction to K-Means Clustering
5. Properties of K-Means Clustering
How about another example of the k-means clustering algorithm? We’ll take the same bank as before, which wants to segment its customers. For simplicity purposes, let’s say the bank only wants to use income and debt to make the segmentation. They collected the customer data and used a scatter plot to visualize it.
On the X-axis, we have the income of the customer, and the y-axis represents the amount of debt. Here, we can clearly visualize that these customers can be segmented into 4 different clusters, as shown nearby.
An Introduction to K-Means Clustering
5. Properties of K means Clustering
• All the data points in a cluster should be similar to each other.

• The data points from different clusters should be as different as possible.

On the X-axis, we have the income of the customer, and the y-axis represents the amount of debt. Here, we can clearly visualize that these customers can be segmented into 4 different clusters, as shown nearby.
An Introduction to K-Means Clustering
6. Understanding the Different Evaluation Metrics for Clustering

Inertia
Recall the first property of clusters we covered above.
• It tells us how far apart the points within a cluster are.
So, inertia actually calculates the sum of distances of all the points within a cluster from the centroid of that cluster. Normally, we use Euclidean distance as the distance metric, as long as most of the features are numeric; otherwise, we use Manhattan distance in case most of the features are categorical.
We calculate this for all the clusters; the final inertia value is the sum of all these distances. The distance within a cluster is known as the intracluster distance, so inertia gives us the sum of intracluster distances. Keeping this in mind, we can say that the lower the inertia value, the better our clusters are.
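As an illustrative sketch (not in the original slides), scikit-learn exposes this quantity as inertia_; note that its inertia_ sums squared distances to the closest centroid. The blob data below is synthetic.

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data: 200 points drawn from 4 blobs.
X, _ = make_blobs(n_samples=200, centers=4, random_state=42)

# Compare inertia (sum of squared intracluster distances) for several values of k.
for k in range(1, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    print(f"k={k}: inertia={km.inertia_:.2f}")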
An Introduction to K-Means Clustering
6. Understanding the Different Evaluation
Metrics for Clustering
Dunn Index
Inertia makes sure that the first property of clusters is satisfied, but it does not care about the second property – that different clusters should be as different from each other as possible. This is where the Dunn index comes into action.

Along with the distance between the centroid and points, the Dunn index also takes into account the distance between two clusters. The distance between the centroids of two different clusters is known as the inter-cluster distance. The Dunn index is the ratio of the minimum of the inter-cluster distances to the maximum of the intracluster distances:

Dunn Index = min(inter-cluster distance) / max(intracluster distance)
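There are several variants of the Dunn index; the sketch below (not from the slides) follows the slide's centroid-based description — minimum centroid-to-centroid distance divided by maximum point-to-centroid distance — and uses made-up toy points.

import numpy as np
from scipy.spatial.distance import cdist

def dunn_index(points, labels, centroids):
    # Maximum intracluster distance: farthest point from its own centroid.
    intra = max(cdist(points[labels == k], centroids[[k]]).max()
                for k in range(len(centroids)))
    # Minimum inter-cluster distance: closest pair of distinct centroids.
    d = cdist(centroids, centroids)
    inter = d[np.triu_indices_from(d, k=1)].min()
    return inter / intra

# Toy example with two obvious clusters.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels = np.array([0, 0, 1, 1])
centroids = np.array([[0.0, 0.5], [10.0, 10.5]])
print(dunn_index(points, labels, centroids))  # large value -> well-separated clusters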
An Introduction to K-Means Clustering
7. How to Apply K-Means Clustering Algorithm?
Let’s now take an example to understand how K-Means actually works. We have these 8 points, and we want to apply k-means to create clusters for them. Here’s how we can do it.

1. Choose the number of clusters k
The first step in k-means is to pick the number of clusters, k.

2. Select k random points from the data as centroids
Next, we randomly select the centroid for each cluster. Let’s say we want to have 2 clusters, so k is equal to 2 here. We then randomly select the centroids. Here, the red and green circles represent the centroids for these clusters.

3. Assign all the points to the closest cluster centroid
Each point is then assigned to the cluster whose centroid is nearest to it.
An Introduction to K-Means Clustering
7. How to Apply K-Means Clustering Algorithm?
4. Recompute the centroids of newly formed clusters
Now, once we have assigned all of the points to either cluster, the next
step is to compute the centroids of newly formed clusters:
Here, the red and green crosses are the new centroids.
5. Repeat steps 3 and 4
We then repeat steps 3 and 4:

The step of computing the centroid and assigning all the points
to the cluster based on their distance from the centroid is a
single iteration. But wait – when should we stop this process? It
can’t run till eternity, right?

Stopping Criteria for K-Means Clustering

There are essentially three stopping criteria that can be adopted to stop the K-means algorithm:
1. Centroids of newly formed clusters do not change
2. Points remain in the same cluster
3. The maximum number of iterations is reached
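The outline mentions implementing K-Means in Python from scratch, but that code is not part of this extract; the following is a minimal NumPy sketch of the initialize/assign/update loop and the stopping criteria described above.

import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Initialization: pick k random data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # 2. Assignment: each point goes to its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # 3. Update: recompute each centroid as the mean of its assigned points.
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # 4. Stop when the centroids no longer change (or max_iters is reached).
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Toy example with two well-separated groups of points.
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])
labels, centroids = kmeans(X, k=2)
print(labels)
print(centroids)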
Decision Tree
1. What is a Decision Tree?
2. Types of Decision Tree
3. Decision Tree Terminologies
4. How do decision tree algorithms work?
5. Decision Tree Assumptions
6. Entropy
7. How do Decision Trees use Entropy?
8. Information Gain
9. When to Stop Splitting?
10. Pruning
11. Decision tree example
Decision Tree
1. What is a Decision Tree?
A decision tree is a hierarchical structure that uses a flowchart-like tree structure to show the predictions that result from a series of feature-based splits.

2. Types of Decision Tree

ID3: This algorithm measures how mixed up the data is at a node using something called entropy. It then chooses the feature that helps to clarify the data the most.

C4.5: This is an improved version of ID3 that can handle missing data and continuous attributes.

CART: This algorithm uses a different measure called Gini impurity to decide how to split the data. It can be used for both classification (sorting data into categories) and regression (predicting continuous values).
Decision Tree
3. Decision Tree Terminologies
Before learning more about decision trees let’s get familiar
with some of the terminologies:

Root Node:

Decision Nodes:

Leaf Nodes:

Sub-Tree:

Pruning:

Branch / Sub-Tree:
Decision Tree
3. Decision Tree Terminologies
Before learning more about decision trees let’s get familiar with some of the
terminologies:

Root Node:
The initial node at the beginning of a decision tree, where the entire
population or dataset starts dividing based on various features or conditions.

Decision Nodes:
Nodes resulting from the splitting of root nodes are known as decision nodes.
These nodes represent intermediate decisions or conditions within the tree.

Leaf Nodes:
Nodes where further splitting is not possible, often indicating the final
classification or outcome. Leaf nodes are also referred to as terminal nodes.

Sub-Tree:
Similar to a subsection of a graph being called a sub-graph, a sub-section of a decision tree is called a sub-tree.
Decision Tree
3. Decision Tree Terminologies
Before learning more about decision trees let’s get familiar with some
of the terminologies:

Pruning:
The process of removing or cutting down specific nodes in a decision
tree to prevent overfitting and simplify the model.

Branch / Sub-Tree:
A subsection of the entire decision tree is referred to as a branch or
sub-tree. It represents a specific path of decisions and outcomes within
the tree.

Parent and Child Node:


In a decision tree, a node that is divided into sub-nodes is known as a
parent node, and the sub-nodes emerging from it are referred to as
child nodes. The parent node represents a decision or condition, while
the child nodes represent the potential outcomes or further decisions based on that condition.
Decision Tree
Example of Decision Tree
Let’s understand decision trees with the help of an example. In the below diagram, the tree will first ask: what is the weather? Is it sunny, cloudy, or rainy? It will then go to the next features, which are humidity and wind. It will again check whether the wind is strong or weak; if it’s a weak wind and it’s rainy, then the person may go and play.
Decision Tree
4. How do decision tree algorithms work?
Decision Tree algorithm works in simpler steps
Starting at the Root: The algorithm begins at the top,
called the “root node,” representing the entire dataset.
Asking the Best Questions: It looks for the most
important feature or question that splits the data into the
most distinct groups. This is like asking a question at a fork
in the tree.
Branching Out: Based on the answer to that question, it
divides the data into smaller subsets, creating new
branches. Each branch represents a possible route through
the tree.
Repeating the Process: The algorithm continues asking questions and splitting the data at each branch until it reaches a stopping point, such as nodes that are sufficiently pure or a maximum depth.
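As an illustrative sketch (the dataset and parameters are assumptions, not from the slides), scikit-learn's DecisionTreeClassifier implements this recursive splitting; criterion="entropy" corresponds to information-gain style splits (ID3/C4.5), while criterion="gini" corresponds to CART.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Illustrative dataset (iris) used only to demonstrate the API.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# max_depth is one stopping criterion that limits how long the tree keeps splitting.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=42)
tree.fit(X_train, y_train)

print("test accuracy:", tree.score(X_test, y_test))
print(export_text(tree, feature_names=["sepal len", "sepal wid", "petal len", "petal wid"]))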
Decision Tree
5. Decision Tree Assumptions
Several assumptions are made to build effective models when
creating decision trees. These assumptions help guide the tree’s
construction and impact its performance. Here are some common
assumptions and considerations when creating decision trees:

Binary Splits
Decision trees typically make binary splits, meaning each node
divides the data into two subsets based on a single feature or
condition. This assumes that each decision can be represented as a
binary choice.

Recursive Partitioning
Decision trees use a recursive partitioning process, where each node
is divided into child nodes, and this process continues until a stopping
criterion is met. This assumes that data can be effectively subdivided into smaller, more manageable subsets.
Decision Tree
5. Decision Tree Assumptions
Several assumptions are made

Feature Independence
Decision trees often assume that the features used for splitting nodes
are independent. In practice, feature independence may not hold, but
decision trees can still perform well if features are correlated.

Homogeneity
Decision trees aim to create homogeneous subgroups in each node,
meaning that the samples within a node are as similar as possible
regarding the target variable. This assumption helps in achieving
clear decision boundaries.

Top-Down Greedy Approach


Decision trees are constructed using a top-down, greedy approach, where each split is chosen to maximize information gain or minimize impurity at that step.
Decision Tree
5. Decision Tree Assumptions
Several assumptions are made

Categorical and Numerical Features


Decision trees can handle both categorical and numerical features.
However, they may require different splitting strategies for each type.

Overfitting
Decision trees are prone to overfitting when they capture noise in the
data. Pruning and setting appropriate stopping criteria are used to
address this assumption.

Impurity Measures
Decision trees use impurity measures such as Gini impurity or entropy
to evaluate how well a split separates classes. The choice of impurity
measure can impact tree construction.
Decision Tree
5. Decision Tree Assumptions
Several assumptions are made

No Missing Values
Decision trees assume that there are no missing values in the dataset
or that missing values have been appropriately handled through
imputation or other methods.

Equal Importance of Features


Decision trees may assume equal importance for all features unless
feature scaling or weighting is applied to emphasize certain features.
Decision Tree
5. Decision Tree Assumptions
Several assumptions are made

No Outliers
Decision trees are sensitive to outliers, and extreme values can
influence their construction. Preprocessing or robust methods may be
needed to handle outliers effectively.

Sensitivity to Sample Size


Small datasets may lead to overfitting, and large datasets may result
in overly complex trees. The sample size and tree depth should be
balanced.
Decision Tree
6. Entropy
Entropy is nothing but the uncertainty in our dataset or measure of
disorder. Let me try to explain this with the help of an example.

Suppose you have a group of friends who decides which movie they
can watch together on Sunday. There are 2 choices for movies, one is
“Lucy” and the second is “Titanic” and now everyone has to tell their
choice.

After everyone gives their answer we see that “Lucy” gets 4 votes
and “Titanic” gets 5 votes. Which movie do we watch now? Isn’t it hard to choose one movie now, because the votes for both movies are somewhat equal?
Decision Tree
6. Entropy

This is exactly what we call disorder: there is an equal number of votes for both movies, and we can’t really decide which movie we should watch. It would have been much easier if the votes for “Lucy” were 8 and for “Titanic” were 2. Here we could easily say that the majority of votes are for “Lucy”, hence everyone will be watching this movie.

In a decision tree, the output is mostly “yes” or “no”. The formula for Entropy is shown below:

E(S) = −p(yes) · log2 p(yes) − p(no) · log2 p(no)

Here, p(yes) and p(no) are the fractions of “yes” and “no” examples in the set S; more generally, E(S) = −Σ pi · log2 pi over all classes.
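A minimal sketch (not from the slides) that computes this entropy for the movie-vote example; the helper function and vote counts are illustrative.

import math

def entropy(counts):
    # E(S) = -sum(p_i * log2(p_i)) over the class proportions.
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# 4 votes for "Lucy" vs. 5 for "Titanic": high disorder, hard to decide.
print(entropy([4, 5]))   # ~0.991
# A lopsided 8 vs. 2 split has lower entropy, so the decision is easier.
print(entropy([8, 2]))   # ~0.722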