MLC Practical

EXPERIMENT NO: 1

Aim: Introduction to Python - installation, operators, decision making, loops

Theory: Python is an interpreted, object-oriented, high-level programming


language. It was created by Guido van Rossum and first released in 1991. Python's
simple, easy-to-learn syntax emphasises readability and therefore reduces the
cost of program maintenance.
Python is widely used in various fields, including web development, data
analysis, artificial intelligence, scientific computing, and more.
Features:
● Interpreted
- There are no separate compilation and execution steps as in C and C++.
- The program is run directly from the source code.
- Internally, Python converts the source code into an intermediate form
called bytecode, which is then translated into the native language of the
specific computer to run it.
- There is no need to worry about linking and loading with libraries, etc.
● Platform Independent
- Python programs can be developed and executed on multiple operating
system platforms.
- Python can be used on Linux, Windows, Macintosh, Solaris and many
more.
● High-level Language
- In Python, there is no need to take care of low-level details such as
managing the memory used by the program.
● Simple
- Closer to the English language; easy to learn.
- More emphasis on the solution to the problem rather than the syntax.

Installation of Python in Windows 11:


Step 1: Go to the official Python website:
https://www.python.org/downloads/
Click on the "Download Python" button. The website will automatically
suggest the best version for your system.

Step 2: Run the Installer:


Once the download is complete, open the installer file.
In the installer window, make sure to check the box that says "Add Python to
PATH". This will allow you to run Python from the command line.
Click "Install Now".

Step 3: Verify the Installation:

Open Command Prompt.
Type python --version and press Enter. You should see the version of Python
that you installed.

Basic Syntax and Concepts of Python:

Basic Syntax

Operators
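The original pages here reproduced screenshots of Python's basic syntax and operator tables, which are not included in this text. As a small stand-in (a sketch written for this experiment's topics, not taken from the original pages), the snippet below touches the three items in the aim: operators, decision making, and loops.

# Arithmetic, comparison and logical operators
a, b = 10, 3
print(a + b, a - b, a * b, a / b, a // b, a % b, a ** b)
print(a > b, a == b, a != b)
print(a > 5 and b < 5, not a > 5)

# Decision making with if / elif / else
marks = 72
if marks >= 75:
    print("Distinction")
elif marks >= 40:
    print("Pass")
else:
    print("Fail")

# Loops: a for loop over a range and a while loop with a condition
for i in range(1, 6):
    print(i, end=" ")
print()
n = 5
while n > 0:
    print(n, end=" ")
    n -= 1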

EXPERIMENT NO: 2

Aim: To study Python libraries used for machine learning

PANDAS
Pandas is a popular Python library for data analysis. It is not directly related
to Machine Learning, but since the dataset must be prepared before training,
Pandas comes in handy: it was developed specifically for data extraction and
preparation. It provides high-level data structures and a wide variety of tools
for data analysis, including many inbuilt methods for grouping, combining and
filtering data.
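As a brief illustration (a sketch with made-up values, not part of the original experiment), a few lines of Pandas cover building, filtering and grouping a DataFrame:

import pandas as pd

# Hypothetical in-memory dataset; the column names are invented for illustration
df = pd.DataFrame({
    "city": ["Pune", "Pune", "Mumbai", "Mumbai"],
    "price": [50, 65, 90, 120],
})
print(df[df["price"] > 60])                  # filtering rows
print(df.groupby("city")["price"].mean())    # grouping and aggregation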

NUMPY
NumPy is a very popular Python library for large multi-dimensional array and
matrix processing, with the help of a large collection of high-level
mathematical functions. It is very useful for fundamental scientific
computations in Machine Learning, particularly for its linear algebra,
Fourier transform, and random number capabilities. High-end libraries like
TensorFlow use NumPy internally for manipulation of tensors.
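A minimal sketch (not part of the original experiment) of the capabilities mentioned above:

import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.random.rand(2, 2)           # random number capabilities
print(a.dot(b))                    # matrix multiplication
print(np.linalg.inv(a))            # linear algebra: matrix inverse
print(np.fft.fft([1, 0, 1, 0]))    # Fourier transform of a small signal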

MATPLOTLIB
Matplotlib is a very popular Python library for data visualisation. Like Pandas,
it is not directly related to Machine Learning. It particularly comes in handy
when a programmer wants to visualise the patterns in the data. It is a 2D
plotting library used for creating 2D graphs and plots. A module named pyplot
makes plotting easy for programmers, as it provides features to control
line styles, font properties, axis formatting, etc. It provides various kinds of
graphs and plots for data visualisation, viz., histograms, error charts, bar
charts, etc.
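A minimal pyplot sketch (not part of the original experiment) showing one of the plot types mentioned above:

import matplotlib.pyplot as plt
import numpy as np

data = np.random.randn(100)        # 100 random values to plot
plt.hist(data, bins=10)            # histogram
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.title("Histogram drawn with pyplot")
plt.show()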

SCIPY
SciPy is a very popular library among Machine Learning enthusiasts as it
contains different modules for optimization, linear algebra, integration and
statistics. There is a difference between the SciPy library and the SciPy stack:
the SciPy library is one of the core packages that make up the SciPy stack. SciPy is
also very useful for image manipulation.
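A minimal sketch (not part of the original experiment) of two SciPy modules named above:

from scipy import optimize, stats

# Minimise a simple quadratic; the minimum is at x = 3
result = optimize.minimize(lambda x: (x[0] - 3) ** 2, x0=[0.0])
print(result.x)

# Basic descriptive statistics of a small sample
print(stats.describe([1, 2, 3, 4, 5]))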

SCIKIT-LEARN
Scikit-learn is one of the most popular ML libraries for classical ML
algorithms. It is built on top of two basic Python libraries, viz., NumPy and
SciPy. Scikit-learn supports most of the supervised and unsupervised
learning algorithms. Scikit-learn can also be used for data-mining and
data-analysis, which makes it a great tool for those starting out with ML.
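A minimal sketch (not part of the original experiment) of the typical scikit-learn workflow on a built-in dataset:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=200)
model.fit(x_train, y_train)            # supervised learning: fit on the training split
print(model.score(x_test, y_test))     # accuracy on the held-out split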

EXPERIMENT NO: 3

Aim: To study detecting and filling missing data in Python

Theory:
In data analysis, missing data is a frequent issue that can hinder the accuracy
of models and insights. Handling this data properly is essential to ensure data
quality and reliable results. In Python, libraries like Pandas provide powerful
tools to detect and manage missing values.

1. Identifying Missing Data:


Missing data is usually represented as NaN (Not a Number). Before taking
action, it is important to identify where these missing values occur in the
dataset.

2. Filling Missing Data:


There are several techniques to fill missing data:
- Constant Value Filling: Missing values can be replaced with a specific
constant, like zero or another relevant placeholder.
- Forward and Backward Filling: Missing values can be filled by propagating
the previous or next valid data point.
- Imputation: More advanced methods involve replacing missing data with
statistical measures like the mean, median, or mode of the column.

3. Dropping Missing Data:


Sometimes, it may be more effective to remove rows or columns with
missing data altogether, especially when the proportion of missing data is
significant.

By applying these techniques, analysts can ensure the dataset is complete and
suitable for further analysis, improving the overall accuracy and performance
of models.
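The program listings below demonstrate detecting missing values and constant-value filling with fillna(0). As a short complementary sketch (not part of the original listing, using a small made-up frame), forward/backward filling, mean imputation and dropping look like this:

import numpy as np
import pandas as pd

s = pd.DataFrame({"one": [1.0, np.nan, 3.0, np.nan, 5.0]})
print(s.ffill())                    # forward filling
print(s.bfill())                    # backward filling
print(s.fillna(s["one"].mean()))    # imputation with the column mean
print(s.dropna())                   # dropping rows with missing data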

Program:

[1] import pandas as pd


import numpy as np
df = pd.DataFrame (np.random.randn(5, 3), index = ['a', 'c', 'e', 'f',
'h'], columns = ['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
print(df)

one two three


a -0.587407 0.245445 -1.157601
b NaN NaN NaN

c -0.595357 -0.062141 -0.679225
d NaN NaN NaN
e 0.910208 -1.230797 0.191110
f -0.062459 0.092898 1.320681
g NaN NaN NaN
h 1.313131 -0.963366 0.358444

[2] import pandas as pd


import numpy as np
df = pd.DataFrame (np.random.randn(5, 3), index = ['a', 'c', 'e', 'f',
'h'], columns = ['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
print (df['one'].isnull())

a False
b True
c False
d True
e False
f False
g True
h False
Name: one, dtype: bool

[3] import pandas as pd


import numpy as np
df = pd.DataFrame (np.random.randn(5, 3), index = ['a', 'c', 'e', 'f',
'h'], columns = ['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
print (df['one'].notnull())

a True
b False
c True
d False
e True
f True
g False
h True
Name: one, dtype: bool

[4] import pandas as pd


import numpy as np
df = pd.DataFrame (np.random.randn(5, 3), index = ['a', 'c', 'e', 'f',
'h'], columns = ['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
print (df['one'].sum())

2.6321528307002513

[5] import pandas as pd
import numpy as np
df = pd.DataFrame (np.random.randn(5, 3), index = [1, 2, 3, 4, 5],
columns = ['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
print (df['one'].sum())

0.0

[6] import pandas as pd


import numpy as np
df = pd.DataFrame (np.random.randn(3, 3), index = ['a', 'c', 'e'],
columns = ['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c'])
print (df)
print ("NaN replaced with '0': ")
print (df.fillna(0))

one two three


a 2.452051 -0.233571 -1.000200
b NaN NaN NaN
c -1.410490 0.543543 0.262501
NaN replaced with '0':
one two three
a 2.452051 -0.233571 -1.000200
b 0.000000 0.000000 0.000000
c -1.410490 0.543543 0.262501

[7] import pandas as pd


import numpy as np
df = pd.DataFrame (np.random.randn(5, 3), index = ['a', 'c', 'e', 'f',
'h'], columns = ['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
print (df['one'].isnull())

EXPERIMENT NO: 4

Aim: To write a program using Python to implement Linear Regression


(Single Variable and Multivariable)

Theory:
Linear regression is a fundamental statistical method used to model the
relationship between a dependent variable (target) and one or more
independent variables (predictors). It is one of the most basic and widely used
forms of predictive modeling in machine learning and statistics.

[1] import pandas as pd


import numpy as np
from sklearn import linear_model
import matplotlib.pyplot as plt

[2] df = pd.read_csv('homeprice.csv')

[3] df

Area Price
0 1000 230000
1 1300 270000
2 3000 620000
3 2600 570000
4 3200 660000
5 2100 510000

[4] plt.xlabel('Area')
plt.ylabel('Price')
plt.scatter(df.Area, df.Price, color='red', marker='+')

<matplotlib.collections.PathCollection at 0xec17800>

[5] x = df.iloc[:, 0].values.reshape(-1, 1)
x

array([[1000],
[1300],
[3000],
[2600],
[3200],
[2100]], dtype=int64)

[6] y = df.iloc[:, 1].values.reshape(-1,1)


y

array([[230000],
[270000],
[620000],
[570000],
[660000],
[510000]], dtype=int64)

[7] reg = linear_model.LinearRegression()


reg.fit(x, y)

LinearRegression()

Predict price of home with area = 3300 sq.ft.

[8] reg.predict([[3300]])

array([[697208.53858785]])

[9] reg.coef_

array([[200.49261084]])

[10] reg.intercept_

array([35582.9228243])

Y = m*x + b (m: coefficient, b: intercept)

[11] 3300*200.49261084 + 35582.92282430234

697208.5385963024

Predict price of home with area = 5000 sq.ft.

[12] reg.predict([[5000]])

array([[1038045.97701149]])

[13] y_pred = reg.predict(x)

Regression Line

[14] plt.scatter(x, y)
plt.plot(x, y_pred,color='red')
plt.show()

Generate CSV file with list of home price predictions.

[15] area_df = pd.read_csv("area.csv")

[16] area_df

Area
0 1100
1 1600
2 2000
3 2200
4 2400
5 2800
6 3400
7 4000

[17] p = reg.predict(area_df)

C:\Users\admin\anaconda3\Lib\site-packages\sklearn\base.py:486: UserWarning: X
has feature names, but LinearRegression was fitted without feature names
warnings.warn

[18] p

array([[256124.79474548],
[356371.1001642 ],
[436568.14449918],
[476666.66666667],
[516765.18883415],
[596962.23316913],
[717257.79967159],
[837553.36617406]])

[19] area_df['price']=p

[20] area_df

Area price
0 1100 256124.794745
1 1600 356371.100164
2 2000 436568.144499
3 2200 476666.666667
4 2400 516765.188834
5 2800 596962.233169
6 3400 717257.799672
7 4000 837553.366174

area_df.to_csv("prediction.csv")

EXPERIMENT NO: 5

Aim: To study and perform Multiple Linear Regression model.

Theory:
In Simple Linear Regression, a single independent/predictor variable (X) is
used to model the response variable (Y). But there may be various cases in
which the response variable is affected by more than one predictor variable;
for such cases, the Multiple Linear Regression algorithm is used. Moreover,
Multiple Linear Regression is an extension of Simple Linear Regression, as it
takes more than one predictor variable to predict the response variable. We
can define it as:
Definition: Multiple Linear Regression is one of the important regression
algorithms which models the linear relationship between a single dependent
continuous variable and more than one independent variable.

Example:
Prediction of CO2 emission based on engine size and number of cylinders in a
car.

Multiple Linear Regression


In Multiple Linear Regression, the target variable (Y) is a linear combination of
multiple predictor variables x1, x2, x3, ..., xn. Since it is an enhancement of
Simple Linear Regression, the same idea is applied to the multiple linear
regression equation, which becomes:

Y = b0 + b1*x1 + b2*x2 + ... + bn*xn

Implementation of the Multiple Linear Regression model using Python: To

implement MLR using Python, we have the following problem. We have a dataset of 6
home prices. This dataset contains four pieces of information: Area, Bedrooms, Age
and Price. The goal is to create a model that can predict the price of a house and
show which factor affects the price the most. Since we need to find the Price, it is
the dependent variable, and the other three variables are independent variables.

1. Data Pre-processing Steps


2. Fitting the MLR model to the training set
3. Predicting the result of the test set
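The printed program pages are not reproduced in this text. The sketch below follows the three steps above under stated assumptions: a small in-memory dataset stands in for the 6-row home-price data described earlier (the actual values and any CSV file name used in the original are unknown).

import pandas as pd
from sklearn import linear_model

# Hypothetical data matching the description: Area, Bedrooms, Age and Price for 6 homes
df = pd.DataFrame({
    "Area":     [2600, 3000, 3200, 3600, 4000, 4100],
    "Bedrooms": [3, 4, 3, 3, 5, 6],
    "Age":      [20, 15, 18, 30, 8, 8],
    "Price":    [550000, 565000, 610000, 595000, 760000, 810000],
})

# Fit the MLR model: Price as a linear combination of Area, Bedrooms and Age
reg = linear_model.LinearRegression()
reg.fit(df[["Area", "Bedrooms", "Age"]], df.Price)
print(reg.coef_, reg.intercept_)

# Predict the price of a 3000 sq.ft., 3-bedroom, 40-year-old home
print(reg.predict([[3000, 3, 40]]))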

EXPERIMENT NO: 6

Aim: To write a program using Python to implement logistic regression.

Problem Statement: Predicting if a person would buy life insurance based on


his age using logistic regression.

Theory:

Logistic Regression

Logistic regression is a supervised machine learning algorithm used for


classification tasks where the goal is to predict the probability that an
instance belongs to a given class or not.

Logistic regression is used for binary classification, where we use the sigmoid
function, which takes the independent variables as input and produces a
probability value between 0 and 1.

● Logistic regression predicts the output of a categorical dependent


variable. Therefore, the outcome must be a categorical or discrete value.
● It can be either Yes or No, 0 or 1, True or False, etc., but instead of giving
the exact value as 0 or 1, it gives probabilistic values which lie
between 0 and 1.
● In Logistic regression, instead of fitting a regression line, we fit an “S”
shaped logistic function, which predicts two maximum values (0 or 1).

Logistic Function – Sigmoid Function

The sigmoid function is a mathematical function used to map the predicted


values to probabilities.

It maps any real value into another value within the range 0 to 1. The output
of logistic regression must lie between 0 and 1 and cannot go beyond this
limit, so it forms a curve shaped like the letter "S".

The S-form curve is called the Sigmoid function or the logistic function.
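For reference, the standard definition of the sigmoid (logistic) function is:

sigmoid(z) = 1 / (1 + e^(-z))

where z = m*x + b is the output of the linear part of the model, so the predicted probability always lies between 0 and 1.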

Types of Logistic Regression

On the basis of the categories, Logistic Regression can be classified into three
types:

Binomial: In binomial Logistic regression, there can be only two possible types
of the dependent variables, such as 0 or 1, Pass or Fail, etc.

Multinomial: In multinomial Logistic regression, there can be 3 or more
possible unordered types of the dependent variable, such as "cat", "dog", or
"sheep".

Ordinal: In ordinal Logistic regression, there can be 3 or more possible ordered


types of dependent variables, such as “low”, “Medium”, or “High”.

Program:

In [ ]: import pandas as pd

import numpy as np

from sklearn import linear_model

from google.colab import files

uploaded = files.upload()


Saving insurance_data.csv to insurance_data.csv

In [ ]: df = pd.read_csv('insurance_data.csv')

df

Out[ ]:

In [ ]: import matplotlib.pyplot as plt

plt.scatter(df.Age,df.Bought_Insurance,marker='+',color='red')

Out[ ]:
<matplotlib.collections.PathCollection at 0x7ab58c242fb0>

In [ ]: from sklearn.model_selection import train_test_split

In [ ]: x_train,x_test,y_train,y_test =
train_test_split(df[['Age']],df.Bought_Insurance,train_size=0.8)

In [ ]: x_test

Out[ ]:

In [ ]: x_train

Out[ ]:

In [ ]: from sklearn.linear_model import LogisticRegression

model = LogisticRegression()

model.fit(x_train,y_train)

Out[ ]: LogisticRegression()
In [ ]: y_predicted = model.predict(x_test)

In [14]: model.predict_proba(x_test)

Out[14]:

array([[0.04246871, 0.95753129],

[0.19494106, 0.80505894],

[0.04246871, 0.95753129],

[0.97525075, 0.02474925],

[0.96265717, 0.03734283],

[0.13674865, 0.86325135]])

In [15]: from sklearn.metrics import confusion_matrix


from sklearn.metrics import classification_report

Find the results (scikit-learn lays out the binary confusion matrix as [[TN FP] [FN TP]])

In [16]: confusion_matrix(y_test,y_predicted)
Out[16]:

array([[1, 1],

[1, 3]])

In [17]: print("Classification Report")

print(classification_report(y_test,y_predicted))

Classification Report

precision recall f1-score support

0 0.50 0.50 0.50 2

1 0.75 0.75 0.75 4

accuracy 0.67 6

macro avg 0.62 0.62 0.62 6

weighted avg 0.67 0.67 0.67 6

EXPERIMENT NO: 7

Aim: To implement logistic regression for multi-class classification.

Theory:
Logistic regression can be extended to handle multiclass classification
problems using several approaches. Unlike binary logistic regression, which
deals with two classes, multiclass classification involves more than two
classes.
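As a brief illustration of two common approaches (a sketch, not part of the original program), scikit-learn can either train one binary model per class (one-vs-rest) or a single multinomial (softmax) model; both are shown here scored on their own training data:

from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

digits = load_digits()

# One-vs-rest: one binary logistic regression is trained per class
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
print(ovr.fit(digits.data, digits.target).score(digits.data, digits.target))

# Multinomial (softmax) logistic regression: a single joint model over all classes
softmax = LogisticRegression(max_iter=1000)
print(softmax.fit(digits.data, digits.target).score(digits.data, digits.target))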

Program:

In [ ]: from sklearn.datasets import load_digits

In [ ]: import matplotlib.pyplot as plt

In [ ]: digits=load_digits()

In [ ]: digits.data.shape

Out[ ]: (1797, 64)


In [ ]: plt.gray()

for i in range(5):

plt.matshow(digits.images[i])

<Figure size 640x480 with 0 Axes>

In [ ]: dir(digits)

Out[ ]: ['DESCR', 'data', 'feature_names', 'frame', 'images', 'target',


'target_names']

In [ ]: digits.DESCR[0]

Out[ ]: '.'
In [ ]: digits.data[0]

Out[ ]: array([ 0., 0., 5., 13., 9., 1., 0., 0., 0., 0., 13., 15., 10.,

15., 5., 0., 0., 3., 15., 2., 0., 11., 8., 0., 0., 4.,

12., 0., 0., 8., 8., 0., 0., 5., 8., 0., 0., 9., 8.,

0., 0., 4., 11., 0., 1., 12., 7., 0., 0., 2., 14., 5.,

10., 12., 0., 0., 0., 0., 6., 13., 10., 0., 0., 0.])
In [ ]: digits.target[0]

Out[ ]: 0
In [ ]: digits.target_names[0]

Out[ ]: 0
Create and train logistic regression model.
In [ ]: from sklearn.linear_model import LogisticRegression

model = LogisticRegression()

In [ ]: from sklearn.model_selection import train_test_split

In [ ]: x_train, x_test, y_train, y_test = train_test_split(digits.data,


digits.target, test_size=0.2)

In [ ]: model.fit(x_train, y_train)

/usr/local/lib/python3.10/dist-packages/sklearn/linear_model/_logistic.py:460:
ConvergenceWarning: lbfgs failed to converge (status=1):

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:

https://scikit-learn.org/stable/modules/preprocessing.html

Please also refer to the documentation for alternative solver options:

https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression

n_iter_i = _check_optimize_result(

Out[ ]: LogisticRegression()
In [ ]: x_test

Out[ ]: array([[ 0., 0., 12., ..., 2., 0., 0.],

[ 0., 0., 0., ..., 0., 0., 0.],

[ 0., 0., 0., ..., 11., 1., 0.],

...,

[ 0., 0., 7., ..., 9., 1., 0.],

[ 0., 0., 4., ..., 3., 0., 0.],

[ 0., 0., 10., ..., 12., 4., 0.]])


In [ ]: model.score(x_test, y_test)

Out[ ]: 0.9638888888888889
In [ ]: model.predict(digits.data[0:10])

Out[ ]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


In [ ]: y_pred = model.predict(x_test)

In [ ]: from sklearn.metrics import confusion_matrix

confusion_matrix(y_test, y_pred)

Out[ ]: array([[39, 0, 0, 0, 0, 1, 0, 0, 0, 0],

[ 0, 38, 0, 0, 0, 0, 0, 0, 0, 0],

[ 0, 1, 35, 0, 0, 0, 0, 0, 0, 0],

[ 0, 0, 0, 36, 0, 1, 0, 1, 2, 0],

[ 0, 0, 0, 0, 29, 0, 0, 0, 0, 0],

[ 0, 1, 0, 0, 0, 30, 0, 0, 0, 0],

[ 0, 0, 0, 0, 0, 0, 36, 0, 0, 0],

[ 0, 1, 0, 0, 0, 0, 0, 31, 0, 1],

[ 0, 0, 0, 0, 0, 0, 0, 0, 39, 0],

[ 0, 1, 0, 0, 0, 1, 0, 0, 2, 34]])
In [ ]: from sklearn.metrics import classification_report

print("Classification Report: \n", classification_report(y_test,


y_pred))

Classification Report:

precision recall f1-score support

0 1.00 0.97 0.99 40

1 0.90 1.00 0.95 38

2 1.00 0.97 0.99 36

3 1.00 0.90 0.95 40

4 1.00 1.00 1.00 29

5 0.91 0.97 0.94 31

6 1.00 1.00 1.00 36

7 0.97 0.94 0.95 33

8 0.91 1.00 0.95 39

9 0.97 0.89 0.93 38

accuracy 0.96 360

macro avg 0.97 0.96 0.96 360

weighted avg 0.97 0.96 0.96 360

EXPERIMENT NO: 8

Aim: To study Naive Bayes using Machine Learning

Theory:

Naive Bayes is a simple yet powerful algorithm for classification based on


Bayes' Theorem. It assumes that the features in the dataset are independent of
each other, hence the term "naive." Despite this simplification, it often
performs well in many real-world applications, particularly in text
classification problems.

Bayes' Theorem
The foundation of Naive Bayes lies in Bayes' Theorem, which helps calculate
the probability of a hypothesis (label) given some evidence (features). The
formula for Bayes' Theorem is:

[ P(H|E) = (P(E|H) * P(H)) / P(E) ]

Where:

● P(H|E) is the posterior probability, the probability of the hypothesis


(class) (H) being true given the evidence (features) (E).
● P(E|H) is the likelihood, the probability of observing the evidence (E)
given the hypothesis (H).
● P(H) is the prior probability of the hypothesis, representing how
common the hypothesis is.
● P(E) is the marginal likelihood, the total probability of the evidence.

The Naive Assumption


The algorithm assumes that all features are independent of each other. This
simplifies the calculations, as the joint probability (P(E|H)) can be broken
down into the product of individual probabilities:

[ P(E|H) = P(e1|H) * P(e2|H) * ... * P(en|H) ]

This independence assumption is rarely true in practice, but Naive Bayes still
works well in many cases due to its simplicity and efficiency.

Types of Naive Bayes Classifiers

1. Gaussian Naive Bayes: Assumes that the features follow a normal


(Gaussian) distribution, often used when dealing with continuous data.
2. Multinomial Naive Bayes: Works well for discrete data, especially for
text classification where the features are word counts or frequencies.

3. Bernoulli Naive Bayes: Suitable for binary/boolean data, often used
when the features are represented as binary values (e.g., the presence or
absence of a word in text classification).
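The program below uses GaussianNB on the Iris measurements. As a small side sketch (with a made-up word-count matrix, not part of the original listing), MultinomialNB and BernoulliNB follow the same fit/predict interface:

import numpy as np
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Hypothetical word-count features for four short documents (0 = ham, 1 = spam)
X_counts = np.array([[2, 0, 1], [0, 3, 0], [1, 0, 2], [0, 2, 1]])
y = np.array([0, 1, 0, 1])

print(MultinomialNB().fit(X_counts, y).predict([[1, 0, 1]]))
print(BernoulliNB().fit((X_counts > 0).astype(int), y).predict([[1, 0, 1]]))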

Applications
Naive Bayes is commonly used in:

● Spam filtering: Classifying emails as spam or not based on the


occurrence of specific words.
● Sentiment analysis: Determining whether a given text has a positive or
negative sentiment.
● Document classification: Categorizing documents into predefined
categories.

Program:
In [4]: from sklearn.naive_bayes import GaussianNB

from sklearn.naive_bayes import MultinomialNB

from sklearn import datasets

from sklearn.metrics import confusion_matrix

from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

x = iris.data

y = iris.target

x_train, x_test, y_train, y_test = train_test_split(x, y,


test_size=0.3, random_state=0)

gnb = GaussianNB()

mnb = MultinomialNB()

y_pred_gnb = gnb.fit(x_train, y_train).predict(x_test)

cnf_matrix_gnb = confusion_matrix(y_test, y_pred_gnb)

cnf_matrix_gnb

Out[4]: array([[16, 0, 0],

[ 0, 18, 0],

[ 0, 0, 11]])
In [5]: from sklearn.metrics import classification_report

print(classification_report(y_test, y_pred_gnb))

In [6]: ans = gnb.predict([[5, 3, 1.2, 2]])

ans

Out[6]: array([1])
In [7]: from sklearn.datasets import load_iris

import pandas as pd

iris = load_iris()

df = pd.DataFrame(iris.data, columns=iris.feature_names)

df['target'] = iris.target

X = iris.data

df.sample(4)

In [9]: df['species'] = pd.Categorical.from_codes(iris.target,
iris.target_names)
df.head()

In [10]: df['species'] = pd.Categorical.from_codes(iris.target,


iris.target_names)

df.tail()

EXPERIMENT NO:9

Aim: Introduction to and study of the K-means algorithm

Theory:

Unsupervised Machine Learning is the process of teaching a computer to use
unlabeled, unclassified data and enabling the algorithm to operate on that
data without supervision. Without any previous training on the data, the
machine's job in this case is to organize unsorted data according to
similarities, patterns, and variations.

K-means clustering assigns data points to one of K clusters depending on
their distance from the centres of the clusters. It starts by randomly placing
the cluster centroids in the space. Each data point is then assigned to one of
the clusters based on its distance from the cluster centroids. After assigning
each point to a cluster, new cluster centroids are computed. This process runs
iteratively until it finds good clusters. In this analysis we assume that the
number of clusters is given in advance and we have to put each point into one
of the groups.

In some cases, K is not clearly defined, and we have to think about the optimal
number of clusters K. K-means clustering performs best when the data is well
separated; when data points overlap, this clustering is not suitable. K-means is
faster compared to other clustering techniques and provides strong coupling
between the data points, but it does not provide clear information regarding
the quality of the clusters. Different initial assignments of the cluster
centroids may lead to different clusters. Also, the K-means algorithm is
sensitive to noise and may get stuck in local minima.

What is the objective of k-means clustering?

The goal of clustering is to divide the population or set of data points into a
number of groups so that the data points within each group are more
comparable to one another and different from the data points within the other
groups. It is essentially a grouping of things based on how similar and
different they are to one another.

How k-means clustering works?


We are given a data set of items, with certain features, and values for these
features (like a vector). The task is to categorize those items into groups. To
achieve this, we will use the K-means algorithm, an unsupervised learning
algorithm. ‘K’ in the name of the algorithm represents the number of
groups/clusters we want to classify our items into.
(It will help if you think of the items as points in an n-dimensional space.) The
algorithm will categorize the items into k groups or clusters of similarity. To
calculate that similarity, we will use the Euclidean distance as a measurement.
The algorithm works as follows:

First, we randomly initialize k points, called means or cluster centroids. We


categorize each item to its closest mean, and we update the mean’s
coordinates, which are the averages of the items categorized in that cluster so
far.
We repeat the process for a given number of iterations and at the end, we have
our clusters.
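Since K is sometimes not known in advance, a common heuristic is the elbow method: fit K-means for several values of K and look for the point where the within-cluster sum of squares (inertia) stops dropping sharply. The sketch below (not part of the original listing) applies it to the same ten points used in the program:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

x = np.array([[5, 3], [10, 15], [15, 12], [24, 10], [30, 45],
              [85, 70], [71, 80], [60, 78], [55, 52], [80, 91]])

inertias = []
for k in range(1, 6):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(x)
    inertias.append(km.inertia_)   # within-cluster sum of squares for this K

plt.plot(range(1, 6), inertias, marker='o')
plt.xlabel('Number of clusters K')
plt.ylabel('Inertia')
plt.title('Elbow method')
plt.show()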

Program:
In [ ]: import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
x = np.array([[5,3], [10,15], [15,12], [24,10], [30,45],
[85,70], [71,80], [60,78], [55,52], [80,91]])
kmeans = KMeans(n_clusters=2, random_state=0).fit(x)
print(kmeans.labels_)
print(kmeans.cluster_centers_)
plt.scatter(x[:,0], x[:,1], label = 'trueposition')

[1 1 1 1 1 0 0 0 0 0]
[[70.2 74.2]
 [16.8 17. ]]

Out[ ]:
<matplotlib.collections.PathCollection at 0x78428cb42080>

In [ ]: kmeans = KMeans(n_clusters=2)
kmeans.fit(x)
print(kmeans.cluster_centers_)

[[70.2 74.2]
[16.8 17. ]]

In [ ]: print(kmeans.labels_)

Out[ ]: [1 1 1 1 1 0 0 0 0 0]

In [ ]: plt.scatter(x[:,0], x[:,1], c=kmeans.labels_,


cmap='rainbow')

Out[ ]:
<matplotlib.collections.PathCollection at 0x7842876e5e70>

In [ ]: plt.scatter(x[:,0], x[:,1], c=kmeans.labels_,


cmap='rainbow')
plt.scatter(kmeans.cluster_centers_[:,0],
kmeans.cluster_centers_[:,1], color='black')

Out[ ]:
<matplotlib.collections.PathCollection at 0x784287a62d10>

In [2]: import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
from sklearn import datasets
iris = datasets.load_iris()
x = iris.data
plt.scatter(x[:,0], x[:,1], label = 'TruePosition')

Out[2]:
<matplotlib.collections.PathCollection at 0x7b286e9dd720>

In [3]: kmeans = KMeans(n_clusters=2)
kmeans.fit(x)
print(kmeans.cluster_centers_)
Out[ ]: [[6.30103093 2.88659794 4.95876289 1.69587629]
[5.00566038 3.36981132 1.56037736 0.29056604]]

In [5]: print(kmeans.labels_)

Out[ ]:
[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0
0 0]

In [6]: plt.scatter(x[:,0], x[:,1], c=kmeans.labels_,


cmap='rainbow')

Out[6]:
<matplotlib.collections.PathCollection at 0x7b2868a699f0>

In [7]: kmeans = KMeans(n_clusters=3)
kmeans.fit(x)
print(kmeans.cluster_centers_)
Out[ ]: [[6.85384615 3.07692308 5.71538462 2.05384615]
[5.006 3.428 1.462 0.246 ]
[5.88360656 2.74098361 4.38852459 1.43442623]]

In [8]: print(kmeans.labels_)

[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1
1 1 1 1 1 1 1 1 1 1 1 1 1 0 2 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2
2 2 2 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 0 2 0 0 0 0 2 0 0 0
0
0 0 2 2 0 0 0 0 2 0 2 0 2 0 0 2 2 0 0 0 0 0 2 0 0 0 0 2 0 0 0 2 0 0 0 2
0
0 2]

EXPERIMENT NO 10:

Aim: To study and perform Principal Component Analysis

Theory:
As the number of features or dimensions in a dataset increases, the amount of
data required to obtain a statistically significant result increases
exponentially. This can lead to issues such as overfitting, increased
computation time, and reduced accuracy of machine learning models. This is
known as the curse of dimensionality, a problem that arises while working with
high-dimensional data.

As the number of dimensions increases, the number of possible combinations


of features increases exponentially, which makes it computationally difficult
to obtain a representative sample of the data. It becomes expensive to perform
tasks such as clustering or classification because the algorithms need to
process a much larger feature space, which increases computation time and
complexity. Additionally, some machine learning algorithms can be sensitive
to the number of dimensions, requiring more data to achieve the same level of
accuracy as lower-dimensional data.

To address the curse of dimensionality, feature engineering techniques are
used, which include feature selection and feature extraction. Dimensionality
reduction is a type of feature extraction technique that aims to reduce the
number of input features while retaining as much of the original information
as possible.

In this experiment, we will discuss one of the most popular dimensionality
reduction techniques, i.e. Principal Component Analysis (PCA).

Program:

Step 1: Import the necessary libraries


In [17]: import numpy as np
from numpy import linalg as la

Step 2: Give the input dataset.


In [2]: x = np.array([2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2, 1, 1.5, 1.1])
y = np.array([2.4, 0.7, 2.9, 2.2, 3, 2.7, 1.6, 1.1, 1.6, 0.9])
data = np.array([x, y])
print(x)
print(y)

print(data)

[2.5 0.5 2.2 1.9 3.1 2.3 2. 1. 1.5 1.1]


[2.4 0.7 2.9 2.2 3. 2.7 1.6 1.1 1.6 0.9]
[[2.5 0.5 2.2 1.9 3.1 2.3 2. 1. 1.5 1.1]
[2.4 0.7 2.9 2.2 3. 2.7 1.6 1.1 1.6 0.9]]

In [3]: xMean = np.mean(x)


yMean = np.mean(y)
print(xMean)
print(yMean)

Out [3]: 1.81


1.9100000000000001

In [4]: data.shape

Out[4]: (2, 10)


Step 3: Compute the mean-adjusted values by subtracting the mean from each point.
In [5]: meanAdjusted = np.zeros((2, 10))
for i in range(len(data[0])):
meanAdjusted[0][i] = data[0][i] - xMean
for i in range(len(data[1])):
meanAdjusted[1][i] = data[1][i] - yMean
print(meanAdjusted)

Out [5]:[[ 0.69 -1.31 0.39 0.09 1.29 0.49 0.19 -0.81 -0.31 -0.71]
[ 0.49 -1.21 0.99 0.29 1.09 0.79 -0.31 -0.81 -0.31 -1.01]]

Step 4: Compute the covariance matrix of the mean adjusted data


In [6]: cov_mat = np.cov(meanAdjusted)
print(cov_mat)

Out [6]: [[0.61655556 0.61544444]


[0.61544444 0.71655556]]

Step 5: Compute the eigenvalues and eigenvectors.


In [7]: eig_vals, eig_vecs = np.linalg.eig(cov_mat)
print('Eigenvectors \n%s' %eig_vecs)
print('\nEigenvalues \n%s' %eig_vals)

Out [7]: Eigenvectors
[[-0.73517866 -0.6778734 ]
[ 0.6778734 -0.73517866]]
Eigenvalues
[0.0490834 1.28402771]

Step 6: Arrange the eigenvalues in descending order.


In [8]: eig_pairs = [(np.abs(eig_vals[i]), eig_vecs[:,i]) for i in
range(len(eig_vals))]
eig_pairs.sort()
eig_pairs.reverse()
print('Eigenvalues in descending order:')
for i in eig_pairs:
print(i[0])

Out [8]: Eigenvalues in descending order:


1.2840277121727839
0.04908339893832736

In [10]: print('Eigenvectors in descending order: ')


for i in eig_pairs:
print(i[1])

Out [10]: Eigenvectors in descending order:


[-0.6778734 -0.73517866]
[-0.73517866 0.6778734 ]

In [11]: eig_pairs [0][1]

Out[11]: array([-0.6778734 , -0.73517866])

Step 7: Retain only the eigenvectors with the largest eigenvalues, then transform the data and display it.

In [12]: transformedData1 = np.matmul (meanAdjusted.T, eig_pairs[0][1])


transformedData2 = np.matmul (meanAdjusted.T, eig_pairs[1][1])
print(transformedData1)
print(transformedData2)

Out [12]: [-0.82797019 1.77758033 -0.99219749 -0.27421042 -1.67580142


-0.9129491 0.09910944 1.14457216 0.43804614 1.22382056]
[-0.17511531 0.14285723 0.38437499 0.13041721 -0.20949846
0.17528244 -0.3498247 0.04641726 0.01776463 -0.16267529]

In [13]: transformedData = [transformedData1, transformedData2]
transformedData = np.transpose(transformedData)
print(transformedData)

Out [13]:
[[-0.82797019 -0.17511531]
[ 1.77758033 0.14285723]
[-0.99219749 0.38437499]
[-0.27421042 0.13041721]
[-1.67580142 -0.20949846]
[-0.9129491 0.17528244]
[ 0.09910944 -0.3498247 ]
[ 1.14457216 0.04641726]
[ 0.43804614 0.01776463]
[ 1.22382056 -0.16267529]]

In [14]: matrix_w = np.hstack((eig_pairs[0][1].reshape(2,1),


eig_pairs[1][1].reshape(2,1)))
print('Matrix W:\n', matrix_w)

Out [14]: Matrix W:


[[-0.6778734 -0.73517866]
[-0.73517866 0.6778734 ]]

Step 8: Reconstruct and transform the original data.

In [16]: originalData = np.matmul(transformedData, matrix_w)


originalData[:][:] = originalData[:][:] + np.array([xMean, yMean])
print(originalData)

Out [16]:
[[2.5 2.4]
[0.5 0.7]
[2.2 2.9]
[1.9 2.2]
[3.1 3. ]
[2.3 2.7]
[2. 1.6]
[1. 1.1]
[1.5 1.6]
[1.1 0.9]]
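For comparison (a sketch, not part of the original listing), the same decomposition can be obtained with scikit-learn's PCA class; the signs of the components may differ from the hand-computed eigenvectors, which does not change the subspace.

import numpy as np
from sklearn.decomposition import PCA

x = np.array([2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2, 1, 1.5, 1.1])
y = np.array([2.4, 0.7, 2.9, 2.2, 3, 2.7, 1.6, 1.1, 1.6, 0.9])
data = np.column_stack([x, y])          # samples as rows, features as columns

pca = PCA(n_components=2)
transformed = pca.fit_transform(data)   # PCA centres the data internally
print(pca.components_)                  # principal directions (eigenvectors)
print(pca.explained_variance_)          # corresponding eigenvalues
print(transformed)                      # data projected onto the components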

EXPERIMENT NO: 11

Aim: To compute covariance matrices and apply Principal Component Analysis to the Iris dataset.

In [1]: import numpy as np


import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
import pylab as pl

In [7]: x1 = np.arange(0, 10)


y1 = np.arange(10, 0, -1)

In [8]: plt.plot(x1,y1)

Out[8]:
[<matplotlib.lines.Line2D at 0x7f1efc951060>]

In [9]: np.cov([x1,y1])

Out[9]:
array([[ 9.16666667, -9.16666667],
[-9.16666667, 9.16666667]])

In [10]: x2 = np.arange(0,10)

y2 = np.array([2]*10)
plt.plot(x2,y2)

Out[10]:
[<matplotlib.lines.Line2D at 0x7f1efc7694b0>]

In [11]: cov_mat = np.cov([x2,y2])


cov_mat
Out[11]:
array([[9.16666667, 0. ],
[0. , 0. ]])

In [12]: x3 = np.array([2]*10)
y3 = np.arange(0,10)
plt.plot(x3,y3)

Out[12]:
[<matplotlib.lines.Line2D at 0x7f1efc5d7640>]

In [13]: np.cov([x3,y3])

Out[13]:
array([[0. , 0. ],
[0. , 9.16666667]])

In [14]: iris = load_iris()

In [15]: iris_df = pd.DataFrame(iris.data,columns=[iris.feature_names])


iris_df.head()

Out[15]:

In [16]: X = iris.data
X.shape

Out[16]: (150, 4)

In [18]: from sklearn.preprocessing import StandardScaler


X_std = StandardScaler().fit_transform(X)
print(X_std[0:5])
print("The shape of Feature Matrix is -",X_std.shape)
Out[18]:

[[-0.90068117 1.01900435 -1.34022653 -1.3154443 ]


[-1.14301691 -0.13197948 -1.34022653 -1.3154443 ]
[-1.38535265 0.32841405 -1.39706395 -1.3154443 ]
[-1.50652052 0.09821729 -1.2833891 -1.3154443 ]
[-1.02184904 1.24920112 -1.34022653 -1.3154443 ]]
The shape of Feature Matrix is - (150, 4)

In [19]: X_covariance_matrix = np.cov(X_std.T)


X_covariance_matrix

Out[19]:
array([[ 1.00671141, -0.11835884, 0.87760447, 0.82343066],
[-0.11835884, 1.00671141, -0.43131554, -0.36858315],
[ 0.87760447, -0.43131554, 1.00671141, 0.96932762],
[ 0.82343066, -0.36858315, 0.96932762, 1.00671141]])

In [20]: eig_vals, eig_vecs = np.linalg.eig(X_covariance_matrix)


print('Eigenvectors \n%s' %eig_vecs)
print('\nEigenvalues \n%s' %eig_vals)

Eigenvectors
[[ 0.52106591 -0.37741762 -0.71956635 0.26128628]
[-0.26934744 -0.92329566 0.24438178 -0.12350962]
[ 0.5804131 -0.02449161 0.14212637 -0.80144925]
[ 0.56485654 -0.06694199 0.63427274 0.52359713]]

Eigenvalues
[2.93808505 0.9201649 0.14774182 0.02085386]

In [21]: eig_pairs = [(np.abs(eig_vals[i]), eig_vecs[:,i]) for i in


range(len(eig_vals))]

# Sort the (eigenvalue, eigenvector) tuples from high to low


eig_pairs.sort(key=lambda x: x[0], reverse=True)

print('Eigenvalues in descending order:')
for i in eig_pairs:
print(i[0])

Out[9]:
Eigenvalues in descending order:
2.938085050199995
0.9201649041624864
0.1477418210449475
0.020853862176462696

In [23]: tot = sum(eig_vals)


var_exp = [(i / tot)*100 for i in sorted(eig_vals, reverse=True)]
cum_var_exp = np.cumsum(var_exp)
print("Variance captured by each component is \n",var_exp)
print(40 * '-')
print("Cumulative variance captured as we travel each component
\n",cum_var_exp)

Variance captured by each component is


[72.96244541329989, 22.850761786701753, 3.668921889282865, 0.5178709107154905]
----------------------------------------
Cumulative variance captured as we travel each component
[ 72.96244541 95.8132072 99.48212909 100. ]

In [24]: matrix_w = np.hstack((eig_pairs[0][1].reshape(4,1),


eig_pairs[1][1].reshape(4,1)))
print ('Matrix W:\n', matrix_w)

Out[24]:
Matrix W:
[[ 0.52106591 -0.37741762]
[-0.26934744 -0.92329566]
[ 0.5804131 -0.02449161]
[ 0.56485654 -0.06694199]]

In [25]: Y = X_std.dot(matrix_w)
print (Y[0:5])

Out[25]:
[[-2.26470281 -0.4800266 ]
[-2.08096115 0.67413356]
[-2.36422905 0.34190802]
[-2.29938422 0.59739451]
[-2.38984217 -0.64683538]]

In [28]: pl.figure()
target_names = iris.target_names
y = iris.target
for c, i, target_name in zip("rgb", [0, 1, 2], target_names):
pl.scatter(Y[y==i,0], Y[y==i,1], c=c, label=target_name)
pl.xlabel('Principal Component 1')
pl.ylabel('Principal Component 2')
pl.legend()
pl.title('PCA of IRIS dataset')
pl.show()

Out[28]:
