

Principal Component Analysis with Python (A Deep Dive)

Francesco Franco

Training a Supervised Machine Learning model — whether it is a traditional one or a Deep Learning model — involves a few steps. The first is feeding forward the data through the model, generating predictions. The second is comparing those predictions with the actual values, which are also called ground truth. The third is to optimize the model based on the minimization of some objective function.


In this iterative process, the model gets better and better, and sometimes it even
gets really good.

But what data will you feed forward?

Sometimes, your input sample will contain many columns, also known as features. It is common knowledge (especially with traditional models) that using every column in your Machine Learning model will mean trouble: the so-called curse of dimensionality. In that case, you'll have to handle the features you are working with selectively. In this article, we'll cover Principal Component Analysis (PCA), which is one such way. The article provides a gentle but extensive introduction to feature extraction for your Machine Learning model with PCA.

The article is structured as follows. First of all, we’ll take a look at what PCA is. We
do this through the lens of the Curse of Dimensionality, which explains why we need
to reduce dimensionality especially with traditional Machine Learning algorithms.
This also involves the explanation of the differences between Feature Selection and
Feature Extraction techniques, which have different goals. PCA, which is part of the
Feature Extraction branch of techniques, is then introduced.

When we have an adequate understanding of PCA conceptually, we’ll take a look at it from a Python point of view. For a sample dataset, we’re going to perform PCA in a
step-by-step fashion. We’ll take a look at all the individual components. Firstly, we’ll
compute the covariance matrix for the variables. Then, we compute the
eigenvectors and eigenvalues, and select which ones are best. Subsequently, we
compose the PCA projection matrix for mapping the data onto the axes of the
principal components. This allows us to create entirely new dimensions which
capture most of the variance from the original dataset, at a fraction of the
dimensions. Note that SVD can also be used instead of eigenvector decomposition;
we’ll also take a look at that.

Once we clearly understand how PCA happens by means of the Python example,
we’ll show you how you don’t have to reinvent the wheel if you’re using PCA. If you
understand what’s going on, it’s often better to use a well-established library for
computing the PCA. Using Scikit-learn’s sklearn.decomposition.PCA API, we will
finally show you how to compute principal components and apply them to perform
dimensionality reduction for your dataset.

All right. Enough introduction. Let’s get to work!



What is Principal Component Analysis?


Before we dive into the specifics of PCA, I think we should first take a look at why it
can be really useful for Machine Learning projects. For this reason, we will first take
a look at Machine Learning projects and the Curse of Dimensionality, which is
especially present when using older Machine Learning algorithms (Support Vector
Machines, Logistic Regression, …).

Then, we’ll discuss what can be done against it — dimensionality reduction — and
explain the difference between Feature Selection and Feature Extraction. Finally,
we’ll get to PCA — and provide a high-level introduction.

Machine Learning and the Curse of Dimensionality


If you are training a Supervised Machine Learning model, at a high level, you are
following a three-step, iterative process:

Since Supervised Learning means that you have a dataset at your disposal, the first
step in training a model is feeding the samples to the model. For every sample, a
prediction is generated. Note that at the first iteration, the model has just been
initialized. The predictions therefore likely make no sense at all.

This becomes especially evident from what happens in the second step, where predictions and ground truth (= actual targets) are compared. This comparison produces an error or loss value which illustrates how bad the model performs.

The third step is then really simple: you improve the model. Depending on the
Machine Learning algorithm, optimization happens in different ways. In the case of
Neural networks, gradients are computed with backpropagation, and subsequently
optimizers (topic for a future blog) are used for changing the model internals.

Weights can also be changed by minimizing one function only; it just depends on
the algorithm.

You then start again. Likely, because you have optimized the model, the predictions
are a little bit better now. You simply keep iterating until you are satisfied with the
results, and then you stop the training process.

Underfitting and overfitting a model

When you are performing this iterative process, you are effectively moving from a
model that is underfit to a model that demonstrates a good fit. Let’s briefly take a look
at them here.

In the first stages of the training process, your model is likely not able to capture the
patterns in your dataset. This is visible in the top figure below. The solution is
simple: just keep training until you achieve the right fit for the dataset (that’s the
bottom figure). Now, you can’t keep training forever. If you do, the model will learn to
focus too much on patterns hidden within your training dataset — patterns that may
not be present in other real-world data at all; patterns truly specific to the sample
with which you are training.

The result: a model tailored to your specific dataset, visible in the second figure.

In other words, training a Machine Learning model involves finding a good balance
between a model that is underfit and a model that is overfit. Fortunately, many
techniques are available to help you with this, but it’s one of the most common
problems in Supervised ML today.

On the top, a model that is underfit with respect to the data. In the middle: a model that is overfit with respect
to the data. On the bottom: the fit that we were looking for.

Having a higher-dimensional feature vector

I think the odds are that I can read your mind at this point.

Overfitting, underfitting, and training a Machine Learning model — how are they
related to Principal Component Analysis?

That’s a fair question. What I want to do is to illustrate why a large dataset — in terms of the number of columns — can significantly increase the odds that your
model will overfit.

Suppose that you have the following feature vector:

x = [1.23, -3.00, 45.2, 9.3, 0.1, 12.3, 8.999, 1.02, -2.45, -0.26, 1.24]

This feature vector is 11-dimensional.

Now suppose that you have 200 samples.


Will a Machine Learning model be able to generalize across all eleven dimensions?
In other words, do we have sufficient samples to cover large parts of the domains
for all features in the vector (i.e., all the axes in the 11-dimensional space)? Or does
it look like Swiss cheese with (massive) holes?

I think it’s the latter. Welcome to the Curse of Dimensionality.

The Curse of Dimensionality

Quoting from Wikipedia:

In machine learning problems that involve learning a “state-of-nature” from a finite number of data samples in a high-dimensional feature space with each feature having a range of possible values, typically an enormous amount of training data is required to ensure that there are several samples with each combination of values.

Wikipedia (n.d.)

That’s what we just described in different words.

The point with “ensuring that there are several samples with each combination of values” is that when this is done well, you will likely be able to train a model that (1) performs well and (2) generalizes well across many settings. With only 200 samples in an 11-dimensional space, however, you almost certainly don’t meet this requirement. The effect is simple: your model will overfit to the data at hand, and it will become worthless when it is used with data from the real world.

Since every added dimension rapidly increases the amount of data needed, the only way out of this curse is to reduce the number of dimensions in our dataset. This is called Dimensionality Reduction, and we’ll now take a look at two approaches — Feature Selection and Feature Extraction.

Dimensionality Reduction: Feature Selection vs Feature Extraction

We saw that if we want to decrease the odds of overfitting, we must reduce the
dimensionality of our data. While this can easily be done in theory (we can simply
cut off a few dimensions, who cares?), this gets slightly difficult in practice (which
dimension to cut… because, how do I know which one contributes most?).


And what if each dimension contributes an equal amount to the predictive power of the
model? What then?

In the field of Dimensionality Reduction, there are two main approaches that you
can use: Feature Selection and Feature Extraction.

Feature Selection involves “the process of selecting a subset of relevant features (variables, predictors) for use in model construction” (Wikipedia, 2004). In other words, Feature Selection approaches attempt to measure the contribution of each feature, so that you can keep the ones that contribute most. What must be clear is that your model will be trained with the original variables — only with fewer of them.

Feature Selection can be a good idea if you already think that most variance
within your dataset can be explained by a few variables. If the others are truly
unimportant, then you can easily discard them without losing too much
information.

Feature Extraction, on the other hand, “starts from an initial set of measured
data and builds derived values (features) intended to be informative and non-
redundant” (Wikipedia, 2003). In other words, a derived dataset will be built that
can be used for training your Machine Learning model. It is lower-dimensional
compared to the original dataset. It will be as informative as possible (i.e., as
much information from the original dataset is pushed into the new variables)
while non-redundant (i.e., we want to avoid that information present in one
variable in the new dataset is also present in another variable in the new
dataset). In other words, we get a lower-dimensional dataset that explains most
variance in the dataset, while keeping things relatively simple.

Especially in the case where each dimension contributes an equal amount, Feature Extraction can be preferred over Feature Selection. The same is true if you have no clue about the contribution of each variable to the model’s predictive power.

Introducing Principal Component Analysis (PCA)


Now that we are aware of the two approaches, it’s time to get to the point. We’ll now
introduce PCA, a Feature Extraction technique, for dimensionality reduction.

Principal Component Analysis is defined as follows:


Principal component analysis (PCA) is the process of computing the principal components
and using them to perform a change of basis on the data, sometimes using only the first
few principal components and ignoring the rest.

Wikipedia (2002)

Well, that’s quite a technical description, isn’t it? And what are “principal components”?

The principal components of a collection of points in a real p-space are a sequence of p direction vectors, where the ith vector is the direction of a line that best fits the data while being orthogonal to the first i − 1 vectors.

Wikipedia (2002)

I can perfectly understand it if you still have no idea what PCA is after reading those
two quotes. I felt the same. For this reason, let’s break down stuff step-by-step.

The goal of PCA: finding a set of vectors (principal components) that best describe
the spread and direction of your data across its many dimensions, allowing you to
subsequently pick the top-n best-describing ones for reducing the dimensionality of
your feature space.

The steps of PCA:

1. If you have a dataset, its spread can be expressed in orthonormal vectors — the
principal directions of the dataset. Orthonormal, here, means that the vectors
are orthogonal to each other (i.e. they have an angle of 90°) and are of size 1.

2. By sorting these vectors in order of importance (by looking at their relative contribution to the spread of the data as a whole), we can find the dimensions of the data which explain most variance.

3. We can then reduce the number of dimensions to the most important ones only.

4. And finally, we can project our dataset onto these new dimensions, called the
principal components, performing dimensionality reduction without losing
much of the information present in the dataset.

The how:


Although we will explain how later in this article, we’ll now visually walk through
performing PCA at a high level. This allows you to understand what happens first,
before we dive into how it happens.

Another important note is that for step (1), decomposing your dataset into vectors
can be done in two different ways — by means of (a) eigenvector decomposition of
the covariance matrix, or (b) Singular Value Decomposition. Later in this article,
we’ll walk through both approaches step-by-step.
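To make the four steps concrete before we dive into the details, here is a minimal NumPy sketch of the whole procedure. It is only a sketch, under the assumption that X is an already standardized (n_samples, n_features) matrix; pca_sketch is just an illustrative helper name, and every step is worked out in full later in this article.

import numpy as np

def pca_sketch(X, n_components=2):
    # Step 1: express the spread of the data in orthonormal eigenpairs
    cov = np.cov(X.T)
    eig_vals, eig_vecs = np.linalg.eigh(cov)
    # Step 2: sort the eigenpairs in order of decreasing importance
    order = np.argsort(eig_vals)[::-1]
    eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]
    # Step 3: keep only the n most important directions
    projection_matrix = eig_vecs[:, :n_components]
    # Step 4: project the data onto these principal components
    return X.dot(projection_matrix)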

Expressing the spread of your dataset in vectors

Suppose that we generate a dataset based on two overlapping blobs which we consider to be part of just one dataset:

from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt
import numpy as np

# Configuration options
num_samples_total = 1000
cluster_centers = [(1,1), (1.25,1.5)]
num_classes = len(cluster_centers)

# Generate data (two overlapping blobs; a small cluster_std keeps them overlapping)
X, y = make_blobs(n_samples = num_samples_total, centers = cluster_centers, n_features = num_classes, cluster_std = 0.2)

# Make plot
plt.scatter(X[:, 0], X[:, 1])
axes = plt.gca()
axes.set_xlim([0, 2])
axes.set_ylim([0, 2])
plt.show()

…which looks as follows:


If you look closely at the dataset, you can see that it primarily spreads into two directions: from the upper right corner to the lower left corner, and from the lower right middle to the upper left middle. Those directions differ from the axis directions, which are orthogonal to each other: the x and y axes have an angle of 90 degrees.

No other set of directions explains as much of the variance as the pair we just mentioned.

After standardization (a topic for a future article), we can visualize the directions as a pair of vectors. These vectors are called the principal directions of the data (StackExchange, n.d.). There are as many principal directions as there are dimensions; in our case, there are two.
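As a quick sketch of how those two directions could be computed for the blob data generated above (assuming X and the imports from that snippet are still in scope; the details follow later in this article):

# Standardize the blob data, then take the eigenvectors of its covariance
# matrix: each column is one principal direction of the dataset.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
eig_vals, eig_vecs = np.linalg.eig(np.cov(X_std.T))
print(eig_vecs)  # two orthonormal direction vectors, one per column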


We call these vectors eigenvectors. Their length is represented by what is known as an eigenvalue. They play a big role in PCA for the following reason:

[The eigenvectors and related] eigenvalues explain the variance of the data along the new
feature axes.

Raschka (2015)

In other words, they allow us to capture both the (1) direction and (2) the magnitude
of the spread in your dataset.

Notice that the vectors are orthogonal to each other. Also recall that our axes are
orthogonal to each other. You can perhaps now imagine that it becomes possible to
perform a transformation to your dataset, so that the directions of the axes are equal
to the directions of the eigenvectors. In other words, we change the “viewpoint” of
our data, so that the axes and vectors have equal directions.

This is the core of PCA: projecting the data to our principal directions, which are
then called principal components.


The benefit here is that while the eigenvectors tell us something about the directions
of our projection, the corresponding eigenvalues tell us something about the
importance of that particular principal direction in explaining the variance of the
dataset. It allows us to easily discard the directions that don’t contribute sufficiently.
That’s why before projecting the dataset onto the principal components, we must
first sort the vectors and reduce the number of dimensions.

Sorting the vectors in order of importance

Once we know the eigenvectors and eigenvalues that explain the spread of our
dataset, we must sort them in order of descending importance. This allows us to
perform dimensionality reduction, as we can keep the principal directions which
contribute most significantly to the spread in our dataset.

Sorting is simple: we sort the list with eigenvalues in descending order and ensure
that our list with eigenvectors is sorted in the same way. In other words, the pairs of
eigenvectors and eigenvalues are jointly sorted in a descending order based on the
eigenvalue. As the largest eigenvalues indicate the biggest explanation for spread in
your dataset, they must be on top of the list.

For the example above, we can see that the eigenvalue for the downward-oriented eigenvector exceeds the one for the upward-oriented vector. If we draw a line through the dataset that overlaps with that vector, we can also see that the variance along that line (where variance is defined as the average squared distance of each point to the mean of the line) is biggest: there is simply no other line we could draw along which the variance is larger.

In fact, the total (relative) contribution of the eigenvectors to the spread for our
example is as follows:

[0.76318124 0.23681876]

(We’ll look at how we can determine this later.)

So, for our example above, we now have a sorted list with eigenpairs.

Reducing the number of dimensions


As we saw above, the first eigenpair explains 76.3% of the spread in our dataset,
whereas the second one explains only 23.7%. Jointly, they explain 100% of the
spread, which makes sense.

Using PCA for dimensionality reduction now allows us to take the biggest-
contributing vectors (if your original feature space was say 10-dimensional, it is
likely that you can find a smaller set of vectors which explains most of the variance)
and only move forward with them.

If our goal was to reduce dimensionality to one, we would now move forward with the eigenvector that contributes 76.3% and use it for data projection. Note that this implies that we lose the 23.7% of the spread explained by the other direction, but in return we get a lower number of dimensions.

Clearly, this example with only two dimensions makes little practical sense — two dimensions can easily be handled by Machine Learning algorithms — but the approach is incredibly useful if you have many dimensions to work with.

Projecting the dataset

Once we have chosen the number of eigenvectors that we will use for
dimensionality reduction (i.e. our target number of dimensions), we can project the
data onto the principal components — or component, in our case.

This means that we will be changing the axes so that they are now equal to the
eigenvectors.

In the example below, we can project our data to one eigenvector. We can see that
only the x axis has values after projecting, and that hence our feature space has
been reduced to one dimension.

We have thus used PCA for dimensionality reduction.
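A short sketch of this projection in code, assuming X still holds the two-dimensional blob samples generated earlier and the imports from that snippet:

# Standardize, decompose the covariance matrix, and keep only the direction
# with the largest eigenvalue; projecting onto it leaves a single dimension.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
eig_vals, eig_vecs = np.linalg.eig(np.cov(X_std.T))
top_direction = eig_vecs[:, np.argmax(eig_vals)].reshape(2, 1)
X_projected = X_std.dot(top_direction)

plt.scatter(X_projected[:, 0], np.zeros(X_projected.shape[0]))
plt.title('Blob data projected onto its first principal component')
plt.show()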


The how of generating eigenpairs: Eigenvector Decomposition or Singular Value Decomposition

Above, we covered the general steps of performing Principal Component Analysis. Recall that they are as follows:

1. Decomposing the dataset into a set of eigenpairs.

2. Sorting the eigenpairs in descending order of importance.

3. Selecting the n most important eigenpairs, where n is the desired number of dimensions.

4. Projecting the data to the n eigenpairs so that their directions equal the ones of
our axes.

In step (1), we simply mentioned that we can express the spread of our data by
means of eigenpairs. On purpose, we didn’t explain how this can be done, for the
sake of simplicity.

https://medium.com/@francescofranco_39234/principal-component-analysis-with-python-a-deep-dive-0c5195bff087#id_token=eyJhbGciOiJSU… 15/36
10/28/24, 9:39 AM Principal Component Analysis with Python (A Deep Dive) | by Francesco Franco | Oct, 2024 | Medium

In fact, there are two methods that are being used for this purpose today:
Eigenvector Decomposition (often called “EIG”) and Singular Value Decomposition
(“SVD”). Using different approaches, they can be used to obtain the same end result:
expressing the spread of your dataset in eigenpairs, the principal directions of your
data, which can subsequently be used to reduce the number of dimensions by
projecting your dataset to the most important ones, the principal components.

While mathematically and hence formally you can obtain the same result with both,
in practice PCA-SVD is numerically more stable (StackExchange, n.d.). For this
reason, you will find that most libraries and frameworks favor a PCA-SVD
implementation over a PCA-EIG one. Nevertheless, you can still achieve the same
result with both approaches!

In the next section, we will take a look at a clear and step-by-step example of PCA
with EIG and, in a future post, we’ll look at PCA with SVD, allowing you to
understand the differences intuitively. We will then look at
sklearn.decomposition.PCA , Scikit-learn's implementation of Principal Component
Analysis based on PCA-SVD. There is no need to perform PCA manually if there are
great tools out there, after all!
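As a small preview of that Scikit-learn route, a minimal sketch could look as follows (the full treatment is deferred to that later discussion). The PCA estimator is SVD-based under the hood, and explained_variance_ratio_ reports each component's relative contribution:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Standardize the Iris data, then reduce it from four to two dimensions.
X, _ = load_iris(return_X_y=True)
X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_std)
print(pca.explained_variance_ratio_)  # relative contribution per component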

PCA-EIG: Eigenvector Decomposition with Python Step-by-Step


One of the ways in which PCA can be performed is by means of Eigenvector
Decomposition (EIG). More specifically, we can use the covariance matrix of our N-
dimensional dataset and decompose it into N eigenpairs. We can do this as follows:

1. Standardizing the dataset: EIG-based PCA only works well if the dataset is centered, i.e. has a mean of zero (μ = 0.0). We will use standardization for this purpose, which also scales the data to a standard deviation of one (σ = 1.0).

2. Computing the covariance matrix of the variables: a covariance matrix indicates how much variance each individual variable has, and how much they ‘covary’ — in other words, how much certain variables move together.

3. Decomposing the covariance matrix into eigenpairs: mathematically, we can rewrite the covariance matrix so that we get a set of eigenvectors and eigenvalues, or eigenpairs.

4. Sorting the eigenpairs in decreasing order of importance, to find the principal directions in your dataset which contribute to the spread most significantly.


5. Selecting the variance contribution of your principal directions and selecting n principal components: if we know the relative contribution of each principal direction to the spread, we can perform dimensionality reduction by selecting only the n most contributing principal components.

6. Building the projection matrix for projecting our original dataset onto the
principal components.

We can see that steps (1), (4), (5) and (6) are general — we also saw them above. Steps
(2) and (3) are specific to PCA-EIG and represent the core of what makes eigenvector
decomposition based PCA unique. We will now cover each step in more detail,
including step-by-step examples with Python. Note that the example in this section
makes use of native / vanilla Python deliberately, and that Scikit-learn based
implementations of, e.g., standardization and PCA will be used in another section.

Using the multidimensional Iris dataset

If we want to show how PCA works, we must use a dataset where the number of dimensions > 2. Fortunately, Scikit-learn provides the Iris dataset, which can be used to classify Iris flowers into three groups based on four characteristics (and hence features, or dimensions): petal length, petal width, sepal length and sepal width.

This code can be used for visualizing two dimensions at a time:

from sklearn import datasets
import matplotlib.pyplot as plt
import numpy as np

# Configuration options
dimension_one = 1
dimension_two = 3

# Load Iris dataset


iris = datasets.load_iris()
X = iris.data
y = iris.target

# Shape
print(X.shape)
print(y.shape)

# Dimension definitions
dimensions = {
0: 'Sepal Length',
1: 'Sepal Width',
2: 'Petal Length',
3: 'Petal Width'
}

# Color definitions
colors = {
0: '#b40426',
1: '#3b4cc0',
2: '#f2da0a',
}

# Legend definition
legend = ['Iris Setosa', 'Iris Versicolour', 'Iris Virginica']

# Make plot
colors = list(map(lambda x: colors[x], y))
plt.scatter(X[:, dimension_one], X[:, dimension_two], c=colors)
plt.title(f'Visualizing dimensions {dimension_one} and {dimension_two}')
plt.xlabel(dimensions[dimension_one])
plt.ylabel(dimensions[dimension_two])
plt.show()

This yields the following plots, if we play with the dimensions:


The plots illustrate that two of the Iris species cannot be linearly separated from each other, but that together they can be separated from the third species. Printing the shape yields the following:

(150, 4)
(150,)

…indicating that we have only 150 samples, but that our feature space is four-
dimensional. Clearly a case where feature extraction could be beneficial for training
our Machine Learning model.

Performing standardization

We first add Python code for standardization, which brings our data to μ = 0.0 and σ = 1.0 by computing x′ = (x − μ) / σ for each dimension.

# Perform standardization
for dim in range(0, X.shape[1]):
  print(f'Old mean/std for dim={dim}: {np.average(X[:, dim])}/{np.std(X[:, dim])}')
  X[:, dim] = (X[:, dim] - np.average(X[:, dim])) / np.std(X[:, dim])
  print(f'New mean/std for dim={dim}: {np.abs(np.round(np.average(X[:, dim])))}/{np.std(X[:, dim])}')

# Make plot (colors was already mapped to a list of per-sample colors above)
plt.scatter(X[:, dimension_one], X[:, dimension_two], c=colors)
plt.title(f'Visualizing dimensions {dimension_one} and {dimension_two}')
plt.xlabel(dimensions[dimension_one])
plt.ylabel(dimensions[dimension_two])
plt.show()

And indeed:

Old mean/std for dim=0: 5.843333333333334/0.8253012917851409
New mean/std for dim=0: 0.0/1.0
Old mean/std for dim=1: 3.0573333333333337/0.4344109677354946
New mean/std for dim=1: 0.0/0.9999999999999999
Old mean/std for dim=2: 3.7580000000000005/1.759404065775303
New mean/std for dim=2: 0.0/1.0
Old mean/std for dim=3: 1.1993333333333336/0.7596926279021594
New mean/std for dim=3: 0.0/1.0
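The explicit loop above can also be written in one vectorized step; a sketch of the equivalent, assuming the un-standardized X:

# Vectorized equivalent of the loop: subtract the per-column mean and divide
# by the per-column standard deviation for all four dimensions at once.
X = (X - np.average(X, axis=0)) / np.std(X, axis=0)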

Computing the covariance matrix of your variables

The next step is computing the covariance matrix for our dataset.

In probability theory and statistics, a covariance matrix (…) is a square matrix giving the
covariance between each pair of elements of a given random vector.

Wikipedia (2003)

If you’re not into mathematics, I can understand that you don’t know what this is yet.
Let’s therefore briefly take a look at a few aspects related to a covariance matrix
before we move on, based on Lambers (n.d.).

A variable, such as X: a mathematical representation of one dimension of the dataset. For example, if X represents petal width, then numbers such as 1.19, 1.20, 1.21, 1.18, 1.16, … which represent the petal widths of individual flowers, can all be described by the variable X.

Variable mean: the average value for the variable. Computed as the sum of all
available values divided by the number of values summed together. As petal width
represents dim=3 in the visualization above, with a mean of ≈ 1.1993, we can see
how the numbers above fit.

Variance: describing the “spread” of data around the variable’s mean. Computed as the average of the squared differences between each number and the mean, i.e. the mean of (x − μ)² over all numbers.

Covariance: describing the joint variability (or joint spread) of two variables. For two variables X and Y with n paired observations, it is computed as Cov(X, Y) = (1/n) Σᵢ (xᵢ − μ_X)(yᵢ − μ_Y).

Covariance matrix for n variables: a matrix representing covariances for each pair
of variables from some set of variables (dimensions) V = [X, Y, Z, ….].


A covariance matrix for two dimensions X and Y looks as follows:

    [ Cov(X, X)  Cov(X, Y) ]
    [ Cov(Y, X)  Cov(Y, Y) ]

Fortunately, there are some properties which make covariance matrices interesting for PCA (Lambers, n.d.):

Cov(X, X) = Var(X)

Cov(X, Y) = Cov(Y, X)

As a consequence, our covariance matrix is a symmetric, square n x n matrix and can hence also be written as follows:

    [ Var(X)     Cov(X, Y) ]
    [ Cov(Y, X)  Var(Y)    ]

We can compute the covariance matrix by generating an (n x n) matrix and then filling it by iterating over its rows and columns, setting each entry to the average product of deviations from the mean for the corresponding pair of variables:

# Compute covariance matrix
cov_matrix = np.empty((X.shape[1], X.shape[1]))  # 4 x 4 matrix
for row in range(0, X.shape[1]):
  for col in range(0, X.shape[1]):
    cov_matrix[row][col] = np.round(np.average(
      [(X[i, row] - np.average(X[:, row])) * (X[i, col] - np.average(X[:, col]))
       for i in range(0, X.shape[0])]), 2)

If we compare our self-computed covariance matrix with one generated with NumPy’s np.cov , we can see the similarities:

# Compare the matrices


print('Self-computed:')
print(cov_matrix)
print('NumPy-computed:')
print(np.round(np.cov(X.T), 2))

> Self-computed:
> [[ 1. -0.12 0.87 0.82]
> [-0.12 1. -0.43 -0.37]
> [ 0.87 -0.43 1. 0.96]
> [ 0.82 -0.37 0.96 1. ]]
> NumPy-computed:
> [[ 1.01 -0.12 0.88 0.82]
> [-0.12 1.01 -0.43 -0.37]
> [ 0.88 -0.43 1.01 0.97]
> [ 0.82 -0.37 0.97 1.01]]
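The small differences (e.g. 1.01 versus 1.00 on the diagonal) appear because np.cov divides by n − 1 by default (the sample covariance), while our loop divides by n. Passing bias=True makes np.cov match the manual computation; a quick check:

# np.cov with bias=True divides by n instead of n - 1, matching our own loop.
print(np.round(np.cov(X.T, bias=True), 2))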

Decomposing the covariance matrix into eigenvectors and eigenvalues


Above, we have expressed the spread of our dataset across the dimensions in our
covariance matrix. Recall that PCA works by expressing this spread in terms of
vectors, called eigenvectors, which together with their corresponding eigenvalues
tell us something about the direction and magnitude of the spread.

The great thing about PCA-EIG is that we can decompose the covariance matrix into eigenvectors and eigenvalues.

We can do this as follows:

    C = V L Vᵀ

Here, C is the covariance matrix, V is a matrix of eigenvectors where each column is an eigenvector, L is a diagonal matrix with the eigenvalues on its diagonal, and Vᵀ is the transpose of V.

We can use NumPy’s numpy.linalg.eig to compute the eigenvectors for this square
array:

# Compute the eigenpairs


eig_vals, eig_vect = np.linalg.eig(cov_matrix)
print(eig_vect)
print(eig_vals)

This yields the following:


[[ 0.52103086 -0.37921152 -0.71988993 0.25784482]
 [-0.27132907 -0.92251432 0.24581197 -0.12216523]
 [ 0.57953987 -0.02547068 0.14583347 -0.80138466]
 [ 0.56483707 -0.06721014 0.63250894 0.52571316]]
[2.91912926 0.91184362 0.144265 0.02476212]

If we compute how much each principal dimension contributes to variance explanation, we get the following:

# Compute variance contribution of each vector


contrib_func = np.vectorize(lambda x: x / np.sum(eig_vals))
var_contrib = contrib_func(eig_vals)
print(var_contrib)
print(np.sum(var_contrib))
# > [0.72978232 0.2279609 0.03606625 0.00619053]
# > 1.0


In other words, the first principal dimension contributes 73% of the variance explanation; the second one 23%. If we therefore reduce the dimensionality to two, we get to keep approximately 73 + 23 = 96% of the variance explanation.
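A common follow-up is to pick the smallest number of components whose cumulative contribution reaches some threshold; a sketch using the contributions computed above, with a 95% threshold as an arbitrary example:

# Sort contributions in descending order, accumulate them, and find the first
# index at which the cumulative contribution reaches the chosen threshold.
cumulative = np.cumsum(sorted(var_contrib, reverse=True))
n_components = int(np.argmax(cumulative >= 0.95) + 1)
print(n_components)  # 2 for the Iris example above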

Sorting the eigenpairs in decreasing order of importance


Even though the eigenpairs above already come out sorted, sorting is a step we must always perform explicitly — especially when you obtain the eigenpairs in a different way.

Sorting the eigenpairs happens by eigenvalue: the eigenvalues must be sorted in descending order, and the corresponding eigenvectors must be sorted along with them.

# Sort eigenpairs
eigenpairs = [(np.abs(eig_vals[x]), eig_vect[:,x]) for x in range(0, len(eig_vals))]
eigenpairs.sort(key=lambda pair: pair[0], reverse=True)
eig_vals = [eigenpairs[x][0] for x in range(0, len(eigenpairs))]
eig_vect = [eigenpairs[x][1] for x in range(0, len(eigenpairs))]
print(eig_vals)

This yields sorted eigenpairs, as we can see from the eigenvalues:

[2.919129264835876, 0.9118436180017795, 0.14426499504958146, 0.0247621221127632]

Selecting n principal components

Above, we saw that 96% of the variance can be explained by only two of the
dimensions. We can therefore reduce the dimensionality of our feature space from
n = 4 to n = 2 without losing much of the information.

Building the projection matrix

The final thing we must do is generate the projection matrix and project our
original data onto the (two) principal components (Raschka, 2015):


# Build the projection matrix
proj_matrix = np.hstack((eig_vect[0].reshape(4,1), eig_vect[1].reshape(4,1)))
print(proj_matrix)

# Project onto the principal components
X_proj = X.dot(proj_matrix)
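As an optional sanity check (a sketch), the projected data should have 150 rows and two columns, and the variance along each new axis should be close to the two largest eigenvalues, up to the rounding we applied to the covariance matrix:

print(X_proj.shape)                         # (150, 2)
print(np.round(np.var(X_proj, axis=0), 2))  # roughly the two largest eigenvalues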

Voilà, you’ve performed PCA


If we now plot the projected data, we get the following plot:

# Make plot of projection


plt.scatter(X_proj[:, 0], X_proj[:, 1], c=colors)
plt.title(f'Visualizing the principal components')
plt.xlabel('Principal component 1')
plt.ylabel('Principal component 2')
plt.show()


That’s it! You just performed Principal Component Analysis using Eigenvector
Decomposition and have reduced dimensionality to two without losing much of the
information in the dataset.

Summary
In this article, we read about performing Principal Component Analysis on the
dimensions of your dataset for the purpose of dimensionality reduction. Some
datasets have many features and few samples, meaning that many Machine
Learning algorithms will be struck by the curse of dimensionality. Feature
extraction approaches like PCA, which attempt to construct a lower-dimensional
feature space based on the original dataset, can help reduce this curse. Using PCA,
we can attempt to recreate our feature space with fewer dimensions and with
minimum information loss.

After defining the context for applying PCA, we looked at it from a high-level
perspective. We saw that we can compute eigenvectors and eigenvalues and sort
those to find the principal directions in your dataset. After generating a projection
matrix for these directions, we can map our dataset onto these directions, which are
then called the principal components. How these eigenvectors can be derived was explained only afterwards, because there are two methods for doing so: Eigenvector Decomposition (EIG) and the more generalized Singular Value Decomposition (SVD).

In two step-by-step examples, we saw how we can apply both PCA-EIG (this post)
and PCA-SVD (next post) for performing a Principal Component Analysis. In the
first case, we saw that we can compute a covariance matrix for the standardized
dataset which illustrates the variances and covariances of its variables. This matrix
can then be decomposed into eigenvectors and eigenvalues, which illustrate the
direction and magnitude of the spread expressed by the covariance matrix. Sorting
the eigenpairs, we can select the principal directions that contribute most to
variance, generate the projection matrix and project our data.

While PCA-EIG works well with symmetric and square matrices (and hence with our
covariance matrix), it can be numerically unstable. That’s why PCA-SVD is very
common in today’s Machine Learning libraries. In another step-by-step example, we
will look at how the SVD can be used directly on the standardized data matrix for
deriving the eigenvectors we also found with PCA-EIG. They can be used for generating a projection matrix which allows us to arrive at the same end result as when performing PCA-EIG.
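As a small teaser for that post, a minimal sketch of the SVD route on the standardized Iris matrix X: the rows of Vᵀ are the principal directions (up to sign), and the squared singular values divided by the number of samples are close to the eigenvalues found above.

# SVD of the standardized data matrix itself, without forming the covariance
# matrix explicitly; the right singular vectors are the principal directions.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
print(Vt[:2])                            # first two principal directions (up to sign)
print(np.round(s ** 2 / X.shape[0], 2))  # close to the eigenvalues computed earlier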

It’s been a thorough read, that’s for sure. Still, I hope that you have learned something. Any comments, questions or suggestions are welcome. Thank you for reading!

References

Wikipedia. (n.d.). Curse of dimensionality. Wikipedia, the free encyclopedia. Retrieved December 3, 2020, from https://en.wikipedia.org/wiki/Curse_of_dimensionality

Wikipedia. (2004, November 17). Feature selection. Wikipedia, the free encyclopedia. Retrieved December 3, 2020, from https://en.wikipedia.org/wiki/Feature_selection

Wikipedia. (2003, June 8). Feature extraction. Wikipedia, the free encyclopedia. Retrieved December 3, 2020, from https://en.wikipedia.org/wiki/Feature_extraction

Wikipedia. (2002, August 26). Principal component analysis. Wikipedia, the free encyclopedia. Retrieved December 7, 2020, from https://en.wikipedia.org/wiki/Principal_component_analysis

StackExchange. (n.d.). Relationship between SVD and PCA. How to use SVD to perform PCA? Cross Validated. https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca

Wikipedia. (n.d.). Eigenvalues and eigenvectors. Wikipedia, the free encyclopedia. Retrieved December 7, 2020, from https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors

Lambers, J. V. (n.d.). PCA — Mathematical Background. https://www.math.usm.edu/lambers/cos702/cos702_files/docs/PCA.pdf

Raschka, S. (2015, January 27). Principal component analysis. Dr. Sebastian Raschka. https://sebastianraschka.com/Articles/2015_pca_in_3_steps.html
