Feature Exploration PCA MNIST
Objective:
- Perform feature exploration using Principal Component Analysis (PCA) on the MNIST dataset.
- Reduce the dimensionality of handwritten digit images while preserving essential features.
- Visualize the reduced features to understand patterns and clusters in the data.
Application Domain:
- Computer Vision: Reducing image dimensions for efficient classification and clustering.
Target:
Dataset:
- Description: MNIST consists of 70,000 grayscale images of handwritten digits (0-9). Each image is
datasets module.
Description:
- PCA projects high-dimensional data into a lower-dimensional space while preserving the maximum variance.
- Visualize the reduced features in a 2D scatter plot to observe clusters and patterns.
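To make the variance-preserving projection concrete, here is a minimal sketch of PCA computed directly from a singular value decomposition. The function name pca_project and the synthetic 784-feature array are illustrative assumptions, not part of the original experiment.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)  # PCA operates on mean-centered data
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    projected = X_centered @ components.T
    # Variance along each direction is proportional to S**2; normalize by the total
    explained_ratio = (S ** 2)[:n_components] / np.sum(S ** 2)
    return projected, explained_ratio

# Synthetic stand-in for flattened images: 200 samples, 784 features (assumption)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 784))
X[:, 0] *= 10.0  # give one feature a dominant variance so PC1 is easy to spot

Z, ratio = pca_project(X, n_components=2)
print(Z.shape, ratio)
```

The first component captures the inflated feature, which is why its explained ratio is larger than the second.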
Implementation:
1. Import Libraries:
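import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
# x_test_flat: test images flattened to shape (n_samples, 784)
pca = PCA(n_components=2)
x_test_pca = pca.fit_transform(x_test_flat)  # project onto the first two principal components
plt.show()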
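A minimal end-to-end sketch of the same pipeline (load, flatten, fit PCA, plot) is given below. As an assumption, it uses scikit-learn's bundled load_digits data (8x8 images) as a small offline stand-in for MNIST; the variable names and the output filename pca_digits.png are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive display
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Assumption: load_digits (8x8 images) stands in for the 28x28 MNIST images
digits = load_digits()
images, labels = digits.images, digits.target  # images has shape (1797, 8, 8)

# Flatten each image into one feature vector and scale pixel values (0..16) to [0, 1]
x_flat = images.reshape(len(images), -1) / 16.0  # shape (1797, 64)

# Project onto the first two principal components
pca = PCA(n_components=2)
x_pca = pca.fit_transform(x_flat)

# Scatter plot of the 2D projection, colored by digit label
fig, ax = plt.subplots(figsize=(8, 6))
scatter = ax.scatter(x_pca[:, 0], x_pca[:, 1], c=labels, cmap="tab10", s=8)
ax.set_xlabel("Principal component 1")
ax.set_ylabel("Principal component 2")
ax.set_title("PCA projection of digit images")
fig.colorbar(scatter, ax=ax, label="Digit")
fig.savefig("pca_digits.png")  # hypothetical output filename
```

For MNIST itself, the same steps would apply to arrays of shape (n_samples, 28, 28), flattened to 784 features per image.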
Output (Complete):
- Explained Variance Ratio: Displays the percentage of variance retained by the two principal
components.
Example Output:
- Feature Visualization: The scatter plot shows clusters of digits in the reduced feature space.
Similar digits (e.g., 0s and 6s) are grouped together, demonstrating effective feature extraction.
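The explained variance ratio can be read from a fitted scikit-learn PCA object, as in the sketch below. It again assumes the bundled load_digits data as a stand-in for MNIST, so the printed numbers will differ from those of the original experiment; the 95% threshold is an illustrative choice.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Assumption: load_digits stands in for MNIST; .data is already flattened to (1797, 64)
x_flat = load_digits().data / 16.0

# Variance retained by the two components used for plotting
pca = PCA(n_components=2).fit(x_flat)
ratio = pca.explained_variance_ratio_
print("Per-component ratio:", ratio)
print("Total retained by 2 components:", ratio.sum())

# Fit all components to see how many are needed to reach a chosen threshold
full = PCA().fit(x_flat)
cumulative = np.cumsum(full.explained_variance_ratio_)
# searchsorted returns the first index where cumulative >= 0.95; add 1 for a count
n_for_95 = int(np.searchsorted(cumulative, 0.95) + 1)
print("Components needed for 95% variance:", n_for_95)
```

Comparing the two-component total with the count needed for 95% gives a sense of how much information the 2D plot leaves out.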
Conclusion:
- PCA effectively reduces the dimensionality of the MNIST dataset while preserving essential
features.
- The clusters observed in the 2D scatter plot indicate that digits with similar shapes are grouped together.
- This experiment demonstrates the power of PCA for unsupervised feature extraction and
visualization.
Future Enhancements:
- Experiment with other dimensionality reduction techniques like t-SNE or UMAP for better
visualization.
- Apply this method to other image datasets such as CIFAR-10 or Fashion MNIST.
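As a starting point for the t-SNE enhancement, the sketch below uses scikit-learn's TSNE. The 500-sample subset, the 30-component PCA pre-reduction, and the perplexity value are illustrative assumptions, and load_digits again stands in for MNIST. UMAP is not shown because it requires a separate third-party package.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Assumption: a 500-sample subset of load_digits keeps the run short
digits = load_digits()
x_flat = digits.data[:500] / 16.0
labels = digits.target[:500]

# Reduce to 30 dimensions with PCA first to denoise and speed up t-SNE
x_reduced = PCA(n_components=30).fit_transform(x_flat)

# Non-linear 2D embedding; random_state fixes the result for reproducibility
tsne = TSNE(n_components=2, perplexity=30.0, init="pca", random_state=0)
x_tsne = tsne.fit_transform(x_reduced)
print(x_tsne.shape)
```

The resulting x_tsne array can be passed to the same scatter-plot code used for the PCA projection.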