Python for Data Science IA 1 Programs

The document provides implementations and explanations of various machine learning algorithms, including Simple Linear Regression, K-Nearest Neighbors (KNN), K-Means Clustering, and Naïve Bayes. Each section includes code snippets for generating datasets, training models, making predictions, and evaluating performance, along with step-by-step breakdowns of the processes involved. Visualizations using matplotlib are also included to illustrate the results of the algorithms.


Simple linear regression

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

def generate_dataset(n_samples=100):
    np.random.seed(42)
    X = 2 * np.random.rand(n_samples, 1)
    y = 3 * X + 4 + np.random.randn(n_samples, 1)
    return X, y

class SimpleLinearRegression:
    def __init__(self):
        self.slope = None
        self.intercept = None

    def fit(self, X, y):
        X_mean = np.mean(X)
        y_mean = np.mean(y)
        # Least-squares estimates of the slope and intercept
        numerator = np.sum((X - X_mean) * (y - y_mean))
        denominator = np.sum((X - X_mean) ** 2)
        self.slope = numerator / denominator
        self.intercept = y_mean - self.slope * X_mean

    def predict(self, X):
        return self.slope * X + self.intercept

if __name__ == "__main__":
    X, y = generate_dataset()
    dataset = pd.DataFrame({
        "X": X.flatten(),
        "y": y.flatten()
    })
    print("Dataset:")
    print(dataset)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                        random_state=42)

    model = SimpleLinearRegression()
    model.fit(X_train.flatten(), y_train.flatten())

    y_pred = model.predict(X_test.flatten())
    mse = mean_squared_error(y_test, y_pred)

    print(f"Model Coefficients: Slope = {model.slope:.2f}, Intercept = {model.intercept:.2f}")
    print(f"Mean Squared Error on Test Set: {mse:.2f}")

    plt.scatter(X, y, color="blue", label="Actual Data")
    plt.plot(X, model.predict(X.flatten()), color="red", label="Regression Line")
    plt.xlabel('X')
    plt.ylabel('y')
    plt.title('Simple Linear Regression')
    plt.legend()
    plt.show()

Explanation:
Step-by-step breakdown:
 Step 1: Importing Libraries
o numpy: Used for generating synthetic data and performing
numerical operations.
o matplotlib.pyplot: Used for visualizing the data and the regression
line.
o pandas: Used to display the generated dataset as a DataFrame (the
regression model itself is the SimpleLinearRegression class defined in
the program, not a scikit-learn import).
o train_test_split: Splits the dataset into training and testing sets.

o mean_squared_error: Used to evaluate the performance of the
model by computing the mean squared error.
 Step 2: Generating Synthetic Data
o We generate synthetic data using the equation y = 3x + 4 with some
added Gaussian noise. This helps simulate real-world data where the
relationship between variables is linear but with some randomness.
o X contains the feature values (input), and y contains the target
values (output).
 Step 3: Splitting Data
o train_test_split() divides the data into training and testing sets. 80%
of the data is used for training, and 20% is used for testing.
 Step 4: Initializing the Model
o We create an instance of the SimpleLinearRegression class, which
starts with an empty slope and intercept.
 Step 5: Training the Model
o model.fit(X_train, y_train) fits the model to the training data,
learning the coefficients (slope and intercept) that best describe the
linear relationship between X and y.
 Step 6: Making Predictions
o y_pred = model.predict(X_test) predicts the target values (y_pred)
for the test data (X_test).
 Step 7: Evaluating the Model
o Mean Squared Error (MSE) is used to measure how well the
model fits the data. A lower MSE indicates a better fit.
o R-squared measures the proportion of the variance in the target
variable that is predictable from the features. A value closer to 1
indicates a good fit (it is not computed in this program; see the
sketch after this list).
 Step 8: Visualizing the Results
o We use matplotlib to plot all of the data points (X vs. y) together
with the regression line that represents the model's predictions.
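Step 7 mentions R-squared, which the program above does not compute. A minimal sketch using scikit-learn's r2_score, reusing the y_test and y_pred from the program, would be:

from sklearn.metrics import r2_score

# Proportion of the variance in y_test explained by the model (1.0 = perfect fit)
r2 = r2_score(y_test, y_pred)
print(f"R-squared on Test Set: {r2:.2f}")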

How Linear Regression Works:


Linear regression attempts to model the relationship between a dependent
variable y and an independent variable X by fitting a straight line to the data.
The relationship is described by the equation:

y = β0 + β1 · X


Where:
 y is the target variable (output),
 X is the input feature (independent variable),
 β0 is the intercept (where the line crosses the y-axis),
 β1 is the slope of the line.
The goal of the algorithm is to find the values of β0 and β1 that minimize the
difference between the predicted values y_pred and the actual values y (using a
loss function such as Mean Squared Error).
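As a quick sanity check of the closed-form estimates used in fit(), here is a small sketch on toy numbers chosen so the answer is obvious:

import numpy as np

# Toy data lying exactly on y = 2x + 1, so the fit should recover slope 2, intercept 1
X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

X_mean, y_mean = X.mean(), y.mean()
slope = np.sum((X - X_mean) * (y - y_mean)) / np.sum((X - X_mean) ** 2)
intercept = y_mean - slope * X_mean
print(slope, intercept)  # 2.0 1.0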

KNN program
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_iris

class KNNClassifier:
    def __init__(self, k=3):
        self.k = k
        self.X_train = None
        self.y_train = None

    def fit(self, X, y):
        self.X_train = X
        self.y_train = y

    def predict(self, X):
        predictions = []
        for x in X:
            # Euclidean distance from x to every training point
            distances = np.linalg.norm(self.X_train - x, axis=1)
            nearest_indices = distances.argsort()[:self.k]
            nearest_labels = self.y_train[nearest_indices]
            # Majority vote among the k nearest neighbors
            prediction = np.bincount(nearest_labels).argmax()
            predictions.append(prediction)
        return np.array(predictions)

iris = load_iris()
iris_df = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                       columns=iris['feature_names'] + ['target'])
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

knn = KNNClassifier(k=3)
knn.fit(X_train, y_train)

y_pred = knn.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")

print("\nPredictions:")
for i, (true_label, pred_label) in enumerate(zip(y_test, y_pred)):
    status = "Correct" if true_label == pred_label else "Incorrect"
    print(f"Test Sample {i + 1}: True Label = {true_label}, "
          f"Predicted = {pred_label}, {status}")

Explanation:
 Step 1: Importing Libraries
o numpy: Used for handling arrays and matrix operations.

o train_test_split: This function splits the dataset into training and
testing subsets.
o pandas: Used to display the Iris data as a DataFrame (the classifier
itself is the KNNClassifier class defined in the program, not a
scikit-learn import).
o load_iris: A function to load the Iris dataset, which is a classic
dataset used for classification tasks.
o accuracy_score: This function calculates the accuracy of predictions
by comparing the predicted labels to the true labels.
 Step 2: Loading the Dataset
o We use load_iris() to load the Iris dataset, which is a simple
classification dataset where the goal is to predict the type of iris
flower (Setosa, Versicolour, or Virginica) based on four features
(sepal length, sepal width, petal length, petal width).
 Step 3: Splitting the Data
o train_test_split() splits the data into a training set and a test set
(80% for training and 20% for testing in this case). It shuffles the
data and ensures that we evaluate the model on unseen data.
 Step 4: Creating the KNN Classifier
o We create an instance of KNNClassifier, setting the number of
neighbors k=3. This means that the class of a new data point will be
predicted based on the majority class among its 3 nearest neighbors.
 Step 5: Training the Model
o knn.fit(X_train, y_train) trains the model using the training dataset
(X_train as input features and y_train as target labels).
 Step 6: Making Predictions
o knn.predict(X_test) makes predictions on the test data based on the
trained model. The X_test is the feature set for which we want to
predict the class labels.
 Step 7: Evaluating the Model
o accuracy_score(y_test, y_pred) compares the predicted labels
(y_pred) with the true labels (y_test) and calculates the accuracy.
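For comparison, the same workflow can be run with scikit-learn's built-in KNeighborsClassifier; this is a sketch assuming the same X_train/X_test split as in the program above:

from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

clf = KNeighborsClassifier(n_neighbors=3)   # 3 nearest neighbors, like k=3 above
clf.fit(X_train, y_train)
y_pred_sklearn = clf.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred_sklearn) * 100:.2f}%")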

How KNN Works:


KNN is a simple yet powerful classification algorithm:
 For each test data point:
1. It calculates the distance (usually Euclidean distance) from that
point to every other point in the training set.
2. Then, it selects the k nearest points.
3. The majority class among the k nearest neighbors is taken as the
prediction for the test data point.
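A tiny numeric sketch of the distance-and-vote steps, on hypothetical one-dimensional data chosen for illustration:

import numpy as np

X_train_toy = np.array([[1.0], [1.2], [3.0], [3.2], [3.4]])
y_train_toy = np.array([0, 0, 1, 1, 1])
x_new = np.array([2.9])

distances = np.linalg.norm(X_train_toy - x_new, axis=1)   # distance to every training point
nearest = distances.argsort()[:3]                         # indices of the 3 closest points
print(np.bincount(y_train_toy[nearest]).argmax())         # majority label -> 1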
K-means program

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

def initialize_centroids(X, k):
    # Pick k distinct data points at random as the starting centroids
    return X[np.random.choice(X.shape[0], k, replace=False)]

def compute_distance(a, b):
    # Euclidean distance between two points
    return np.sqrt(np.sum((a - b) ** 2))

def assign_clusters(X, centroids):
    clusters = []
    for point in X:
        distances = [compute_distance(point, centroid) for centroid in centroids]
        cluster = np.argmin(distances)  # Index of the nearest centroid
        clusters.append(cluster)
    return np.array(clusters)

def update_centroids(X, clusters, k):
    new_centroids = np.zeros((k, X.shape[1]))
    for i in range(k):
        # New centroid is the mean of all points assigned to cluster i
        new_centroids[i] = np.mean(X[clusters == i], axis=0)
    return new_centroids

def k_means(X, k, max_iters=100, tolerance=1e-4):
    centroids = initialize_centroids(X, k)
    for i in range(max_iters):
        clusters = assign_clusters(X, centroids)
        new_centroids = update_centroids(X, clusters, k)
        if np.all(np.abs(new_centroids - centroids) < tolerance):
            print(f"Converged at iteration {i}")
            break
        centroids = new_centroids
    return centroids, clusters

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60,
                  random_state=0)

k = 4
centroids, clusters = k_means(X, k)

plt.scatter(X[:, 0], X[:, 1], c=clusters, cmap='viridis')
plt.scatter(centroids[:, 0], centroids[:, 1], s=300, c='red', marker='X')
plt.title("K-Means Clustering (from scratch)")
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.show()

Explanation:
Step-by-step breakdown:
 Step 1: Importing Libraries
o numpy: For handling numerical data and matrix operations.

o matplotlib.pyplot: For visualizing the data points and clusters.

o make_blobs: A function to generate synthetic data with a specified
number of clusters (the clustering itself is done by the k_means()
function defined in the program, not by scikit-learn's KMeans).
 Step 2: Generate Data
o make_blobs() generates synthetic data with 300 samples, 4 centers
(clusters), and a cluster standard deviation of 0.60. This is used to
simulate a real-world clustering problem.
o X holds the generated data points; the true cluster labels returned
by make_blobs are discarded (assigned to _), since K-Means is an
unsupervised learning algorithm and does not use them.
 Step 3: Running K-Means
o k_means(X, k) first calls initialize_centroids(), which picks k random
data points as the starting centroids.
o It then alternates between assign_clusters(), which assigns each
point to its nearest centroid, and update_centroids(), which
recomputes each centroid as the mean of the points assigned to it.
o The loop stops when the centroids move by less than the tolerance
(convergence) or when max_iters is reached.
 Step 4: Getting Cluster Centroids and Labels
o centroids: the coordinates of the centers of the 4 clusters returned
by k_means().
o clusters: the cluster assignment for each data point; each point gets
the index of the cluster it belongs to.
 Step 5: Visualizing Clusters
o The first plt.scatter() call colors each data point according to its
assigned cluster, and the second highlights the centroids in red with
an X marker.
o This plot helps us visually confirm the clusters formed by the K-
Means algorithm.
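For comparison, the same clustering can be done with scikit-learn's built-in KMeans; a minimal sketch on the same X from make_blobs would be:

from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
kmeans.fit(X)

centroids_sklearn = kmeans.cluster_centers_   # coordinates of the 4 centroids
labels_sklearn = kmeans.labels_               # cluster assignment for each point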

How K-Means Works:


 Initialization:
o K-Means starts by randomly initializing k centroids (cluster centers).

 Iteration:
1. Assigning Labels: For each data point, it computes the distance from
the point to each centroid and assigns the point to the nearest
centroid (i.e., the cluster).
2. Recalculating Centroids: After assigning labels to all points, it
recalculates the centroids by averaging the points within each
cluster.
3. Repeat: Steps 1 and 2 are repeated iteratively until the centroids no
longer change (i.e., convergence is reached).
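As a small illustration of one assignment/update cycle, here is a sketch that reuses the assign_clusters() and update_centroids() functions from the program above on a tiny hand-made dataset:

import numpy as np

# Two obvious groups in 2-D
X_toy = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]])
centroids_toy = np.array([[1.0, 1.0], [9.0, 9.5]])   # initial guesses

clusters_toy = assign_clusters(X_toy, centroids_toy)       # -> [0, 0, 1, 1]
centroids_toy = update_centroids(X_toy, clusters_toy, 2)   # -> [[1.25, 1.5], [8.5, 8.75]]
print(clusters_toy, centroids_toy)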
Naïve Bayes Program

import numpy as np
from sklearn.datasets import make_classification

class NaiveBayes:
    def __init__(self):
        self.class_probs = {}
        self.class_means = {}
        self.class_vars = {}

    def fit(self, X, y):
        # Get unique class labels
        classes = np.unique(y)

        # Prior probability P(class) for each class
        for c in classes:
            self.class_probs[c] = np.mean(y == c)

        # Per-class mean and variance of each feature (Gaussian likelihoods)
        for c in classes:
            X_c = X[y == c]
            self.class_means[c] = np.mean(X_c, axis=0)
            self.class_vars[c] = np.var(X_c, axis=0)

    def gaussian_pdf(self, x, mean, var):
        return (1 / np.sqrt(2 * np.pi * var)) * np.exp(-(x - mean) ** 2 / (2 * var))

    def predict(self, X):
        predictions = []
        for sample in X:
            class_probs = {}
            for c in self.class_probs:
                prob = np.log(self.class_probs[c])  # Log prior P(class)
                for i in range(len(sample)):
                    # Add the log likelihood of each feature given the class
                    prob += np.log(self.gaussian_pdf(sample[i],
                                                     self.class_means[c][i],
                                                     self.class_vars[c][i]))
                class_probs[c] = prob
            # Pick the class with the highest log posterior
            predicted_class = max(class_probs, key=class_probs.get)
            predictions.append(predicted_class)

        return np.array(predictions)

# n_informative=2, n_redundant=0 so that both features can be used with n_features=2
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, n_classes=2, random_state=42)

nb = NaiveBayes()
nb.fit(X, y)

predictions = nb.predict(X)

accuracy = np.mean(predictions == y)
print(f"Accuracy: {accuracy * 100:.2f}%")

Explanation:
Step-by-step breakdown:
 Step 1: Importing Libraries
o numpy: Used for all the numerical operations (class priors, per-class
means and variances, Gaussian densities, and accuracy).
o make_classification: A scikit-learn function that generates a
synthetic binary classification dataset.
 Step 2: Generating the Dataset
o make_classification() creates 200 samples with 2 features and 2
classes; X holds the feature values and y holds the class labels
(0 or 1).
 Step 3: Initializing the Naive Bayes Model
o NaiveBayes() is a from-scratch Gaussian Naive Bayes classifier that
stores the class priors, the per-class feature means, and the
per-class feature variances.
 Step 4: Training the Model
o nb.fit(X, y) computes, for each class, its prior probability P(class)
and the mean and variance of every feature within that class.
 Step 5: Making Predictions
o nb.predict(X) computes, for each sample, the log prior plus the sum
of log Gaussian likelihoods of its features under every class, and
predicts the class with the highest log posterior.
 Step 6: Evaluating the Model
o np.mean(predictions == y) compares the predicted labels with the
true labels (y) and calculates the accuracy. Note that the program
evaluates on the same data it was trained on, since no train/test
split is used.
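For reference, scikit-learn's built-in GaussianNB implements the same model; a minimal sketch (assuming the same X and y from make_classification, and adding a train/test split so the evaluation is on unseen data) would be:

from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=42)

gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred) * 100:.2f}%")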

How Naive Bayes Works:


Naive Bayes is a probabilistic classifier based on Bayes' Theorem, with the
"naive" assumption that all features are independent given the class label. It
works by computing the probability of each class given the features and
predicting the class with the highest probability.

 Bayes' Theorem: P(C|X) = P(X|C) · P(C) / P(X)
Where:
o P(C|X) is the posterior probability of class C given the features X.
o P(X|C) is the likelihood of the features X given the class C.
o P(C) is the prior probability of class C.
o P(X) is the probability (evidence) of the features X.

In practice, Naive Bayes estimates the probability of each class by assuming that
the features are conditionally independent. For the Gaussian Naive Bayes (used
here), it assumes the features are normally distributed and uses the mean and
variance of each feature to calculate the likelihood.
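As a small numeric illustration of the Gaussian likelihood used above (toy numbers, not taken from the program's dataset):

import numpy as np

def gaussian_pdf(x, mean, var):
    return (1 / np.sqrt(2 * np.pi * var)) * np.exp(-(x - mean) ** 2 / (2 * var))

# A feature value of 1.0 is far more likely under a class with mean 1.2
# than under a class with mean 4.0 (same variance 0.25)
print(gaussian_pdf(1.0, 1.2, 0.25))  # ~0.74
print(gaussian_pdf(1.0, 4.0, 0.25))  # ~1.2e-08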
