DM1580 - LAB 4 Machine Learning Video
Group 14
Hind Belhouari
Leon Boman
Roger Vendrell Colet
1. Objective
The objective of this lab was to understand how to compute and use eigenfaces for
image compression and face reconstruction using Principal Component Analysis
(PCA). Specifically, we computed the mean face and covariance matrix of a training
set, extracted its principal components (the eigenfaces), selected the strongest of
them, and used them to encode and reconstruct previously unseen faces.
2. The data
The data we have dealt with consists of a set of images depicting human faces,
13233 samples in total. We reduced this to 550 samples (500 for the training set
and 50 for the test set).
The following is an example of the images found in the dataset:
We can see that the faces are centered and aligned. Computing the mean image over
all the faces confirms this: it reveals a darker pattern around the eyes and
mouth, while the forehead and cheeks appear with higher intensity values:
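The mean face described above is just the per-pixel average over the training images. A minimal sketch, using random placeholder data in place of the actual dataset (the real images, loading code, and variable names are not shown in this report):

```python
import numpy as np

# Hypothetical stand-in for the training set: 500 grayscale faces,
# each 100x75 pixels, flattened to vectors of length 7500.
rng = np.random.default_rng(0)
faces = rng.random((500, 7500))

# The mean face is the per-pixel average over all training images.
mean_face = faces.mean(axis=0)            # shape (7500,)
mean_image = mean_face.reshape(100, 75)   # back to image shape for display
```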
From the centered training images we computed the covariance matrix
C = (1/n) Σᵢ (xᵢ − μ)(xᵢ − μ)ᵀ, where n is the number of face images, xᵢ is the
i-th image as a column vector, and μ is the mean face. From there, we could
compute its eigenvectors (the principal components) and its eigenvalues (which
represent the importance, or strength, of each component).
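One way to sketch this step, again on placeholder data: rather than eigendecomposing the full 7500×7500 covariance matrix directly, the usual eigenfaces trick (an assumption here, since the report does not say which method was used) works with the much smaller n×n Gram matrix and maps its eigenvectors back to image space:

```python
import numpy as np

rng = np.random.default_rng(0)
faces = rng.random((500, 7500))      # hypothetical training set
mean_face = faces.mean(axis=0)
centered = faces - mean_face         # subtract the mean face

# Eigendecompose the small n x n Gram matrix instead of the
# 7500 x 7500 covariance matrix; both share nonzero eigenvalues.
n = centered.shape[0]
gram = centered @ centered.T / n     # (500, 500)
eigvals, vecs = np.linalg.eigh(gram) # eigh returns ascending order

# Sort descending and map Gram eigenvectors back to image space.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigenfaces = centered.T @ vecs[:, order]            # (7500, 500)
eigenfaces /= np.linalg.norm(eigenfaces, axis=0)    # unit-norm columns
```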
The eigenvalue graph shows a rapid decrease in the strength of the eigenvectors of
the covariance matrix; the curve flattens toward 0 before the 100th eigenvalue.
This suggests that keeping somewhere between 0 and 100 eigenvectors, such as 50,
is a reasonable choice. In particular, 50 is 10% of the 500 available eigenvectors
(a very valuable reduction from the data compression perspective) while still
ensuring enough expressiveness.
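The same decision can be framed numerically via cumulative explained variance. This is a sketch of that idea on a synthetic eigenvalue spectrum with the rapid decay described above; the 95% threshold is an illustrative assumption, not a value from the report:

```python
import numpy as np

# Synthetic, rapidly decaying eigenvalue spectrum (placeholder data).
eigvals = np.sort(np.random.default_rng(1).exponential(1.0, 500))[::-1]

# Cumulative fraction of total variance captured by the first k eigenvectors.
explained = np.cumsum(eigvals) / eigvals.sum()

# Smallest k whose eigenvectors capture at least 95% of the variance.
k = int(np.searchsorted(explained, 0.95)) + 1
```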
Once we had used the training dataset (500 face images) to find the 50 principal
components that best represent the “face signal”, we used those to encode
previously unseen faces.
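Encoding a new face amounts to projecting it (after subtracting the mean face) onto the 50 eigenfaces; decoding is the matching linear combination. A minimal sketch, using random orthonormal columns as stand-in eigenfaces:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in basis: 50 orthonormal columns of length 7500 (placeholder
# for the real eigenfaces), plus a placeholder mean face and test face.
eigenfaces, _ = np.linalg.qr(rng.random((7500, 50)))
mean_face = rng.random(7500)
test_face = rng.random(7500)

# Encoding: 50 projection coefficients per face.
coeffs = eigenfaces.T @ (test_face - mean_face)

# Decoding: mean face plus a linear combination of eigenfaces.
reconstruction = mean_face + eigenfaces @ coeffs
```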
This process is particularly interesting from the data compression perspective,
because it allows us to encode every face image with a fixed number of coefficients
(in our case, 50). Without the encoding, each image needs 7500 values
(100×75 pixels).
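The per-image saving is easy to check (note that the 50 eigenfaces and the mean face must also be stored once, a shared one-off cost not counted here):

```python
# Each raw image stores 100*75 = 7500 pixel values; the encoded
# form stores only the 50 projection coefficients.
raw_values = 100 * 75
encoded_values = 50
ratio = raw_values / encoded_values   # per-image reduction factor
```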
After around k = 50, the curve starts to level off, meaning that adding more
eigenfaces does not improve the quality much. Between k = 50 and k = 100, the
PSNR only increases slightly, from about 72 dB to just over 73 dB.
In the end, using about 50 to 70 eigenfaces gives a good balance between quality
and efficiency. The PSNR values stay above 70 dB, which means the reconstructed
faces are very similar to the originals, even when not using all components.
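For reference, the PSNR metric used above compares an image to its reconstruction through the mean squared error. A minimal sketch with the standard formula, on placeholder images (the peak value of 255 assumes 8-bit pixels):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    # Peak signal-to-noise ratio in dB: higher means closer images.
    diff = original.astype(float) - reconstructed.astype(float)
    mse = np.mean(diff ** 2)
    return 10 * np.log10(peak ** 2 / mse)

# Placeholder 100x75 image and a slightly noisy "reconstruction".
rng = np.random.default_rng(0)
img = rng.integers(0, 256, (100, 75))
noisy = np.clip(img + rng.normal(0, 2, img.shape), 0, 255)
quality = psnr(img, noisy)
```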