PCA & Image Processing

The document outlines a Python code implementation for handwritten digit classification using the MNIST dataset, employing Principal Component Analysis (PCA) for dimensionality reduction and Logistic Regression for classification. It includes steps for importing libraries, loading and visualizing the dataset, reducing dimensions, splitting data, training the model, and evaluating accuracy, achieving around 91-92% accuracy. Additionally, it describes image preprocessing techniques to convert custom images into the MNIST format for potential machine learning applications.

Uploaded by

dagiabi51

PCA

from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Download MNIST (70,000 images, 784 pixel features each) from OpenML
data = fetch_openml("mnist_784", version=1)
x, y = data.data, data.target

# Show the image at index 50 as a 28x28 grayscale picture
plt.imshow(x.iloc[50].values.reshape(28, 28), cmap="gray")
plt.show()

# Reduce the 784 pixel features to 50 principal components
pca = PCA(n_components=50)
x_reduced = pca.fit_transform(x)

# Hold out 20% of the samples for testing
x_train, x_test, y_train, y_test = train_test_split(
    x_reduced, y, random_state=42, test_size=0.2
)

# Train multinomial logistic regression on the reduced features
lg = LogisticRegression(max_iter=1000, solver="lbfgs", multi_class="multinomial")
lg.fit(x_train, y_train)

# Evaluate on the held-out test set
predicted = lg.predict(x_test)
accuracy = accuracy_score(y_test, predicted)
print(accuracy)

This code demonstrates how to perform handwritten digit classification using the MNIST dataset with Principal Component Analysis (PCA) for dimensionality reduction and Logistic Regression for classification. Here's a detailed breakdown:

1. Importing Libraries

from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

• fetch_openml: Downloads datasets from OpenML
• train_test_split: Splits data into training and test sets
• LogisticRegression: The classification model used here
• accuracy_score: Evaluates model performance
• PCA: Performs dimensionality reduction
• matplotlib.pyplot: Visualizes the digits

2. Loading the MNIST Dataset

data = fetch_openml("mnist_784", version=1)
x, y = data.data, data.target

• MNIST contains 70,000 handwritten digit images (0-9)
• Each image is 28×28 pixels, flattened into a 784-dimensional feature vector
• x contains the pixel data; y contains the digit labels (0-9), which this OpenML dataset stores as strings rather than integers

3. Visualizing a Sample Digit

plt.imshow(x.iloc[50].values.reshape(28, 28), cmap="gray")
plt.show()

• Selects the image at index 50 (the 51st row, since indexing starts at 0)
• x.iloc assumes x is a pandas DataFrame, which recent scikit-learn versions return for this dataset by default
• Reshapes the 784-length vector back to 28×28
• Displays it in grayscale
• This helps verify the data is loaded correctly
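
The flatten and reshape round trip behind this step can be sketched with NumPy alone. The array below is synthetic stand-in data, not an MNIST digit, and the names image, flat, and restored are illustrative only.

```python
import numpy as np

# Synthetic 28x28 "image" holding the values 0..783 (stand-in data, not a real digit)
image = np.arange(28 * 28).reshape(28, 28)

# Flatten row by row into a 784-length vector, the layout each MNIST row uses
flat = image.reshape(-1)

# Reshape back to 28x28, which is what happens before plt.imshow is called
restored = flat.reshape(28, 28)

print(flat.shape)                       # (784,)
print(restored.shape)                   # (28, 28)
print(np.array_equal(image, restored))  # True
```

Because the flattening is row-major, element 28 of the vector is the first pixel of the second row.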

4. Dimensionality Reduction with PCA

pca = PCA(n_components=50)
x_reduced = pca.fit_transform(x)

• PCA reduces the 784 dimensions to just 50 principal components
• This speeds up training while preserving most of the variance; the retained fraction can be checked with pca.explained_variance_ratio_.sum()
• The reduced dataset (x_reduced) now has 50 features per image instead of 784
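
To make "preserving most of the variance" concrete, here is a minimal NumPy sketch of the idea behind PCA: center the data, take a singular value decomposition, and project onto the leading directions. It runs on synthetic data (an assumption, chosen so the example needs no download) and is a conceptual illustration, not scikit-learn's internal code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 10 features, variance concentrated in 3 directions
latent = rng.normal(size=(200, 3)) * np.array([5.0, 3.0, 2.0])
mixing = rng.normal(size=(3, 10))
data = latent @ mixing + 0.1 * rng.normal(size=(200, 10))

# Center the data, then decompose; the rows of vt are the principal directions
centered = data - data.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Fraction of the total variance carried by each component (sorted, largest first)
explained_ratio = s**2 / np.sum(s**2)

k = 3
reduced = centered @ vt[:k].T  # project onto the first k components

print(reduced.shape)              # (200, 3)
print(explained_ratio[:k].sum())  # close to 1 for this synthetic data
```

The same cumulative sum, computed on MNIST with 50 components, tells you how much variance the reduced features keep.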

5. Splitting Data into Train/Test Sets

x_train, x_test, y_train, y_test = train_test_split(
    x_reduced, y, random_state=42, test_size=0.2
)

• 80% of data (56,000 samples) for training
• 20% of data (14,000 samples) for testing
• random_state=42 ensures reproducible splits
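
The split sizes and the role of a fixed seed can be checked with a small NumPy sketch. It shuffles row indices by hand, so it illustrates the idea only and will not reproduce the exact rows that train_test_split selects.

```python
import numpy as np

n_samples = 70_000
test_size = 0.2

rng = np.random.default_rng(42)       # fixed seed, so the shuffle is the same every run
indices = rng.permutation(n_samples)  # shuffled row indices

n_test = int(round(n_samples * test_size))
test_idx, train_idx = indices[:n_test], indices[n_test:]

print(len(train_idx), len(test_idx))  # 56000 14000
```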

6. Training Logistic Regression Model

lg = LogisticRegression(max_iter=1000, solver="lbfgs", multi_class="multinomial")
lg.fit(x_train, y_train)

• Uses multinomial logistic regression (for multi-class classification)
• max_iter=1000: Allows up to 1000 iterations for convergence
• solver="lbfgs": A good choice for medium-sized datasets
• multi_class="multinomial": Fits a single softmax model over all 10 classes. scikit-learn 1.5 and later deprecate this parameter; with lbfgs the multinomial formulation is already the default for multi-class targets, so it can be omitted there
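
What "multinomial" means at prediction time can be sketched in a few lines: each class gets a linear score, and a softmax turns the scores into probabilities. The weights below are random placeholders rather than a trained model, and the names softmax, W, and b are illustrative.

```python
import numpy as np

def softmax(scores):
    """Convert each row of class scores into probabilities that sum to 1."""
    shifted = scores - scores.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 50))   # 4 samples with 50 PCA features each
W = rng.normal(size=(50, 10))  # placeholder weights, one column per digit class
b = np.zeros(10)               # placeholder intercepts

probs = softmax(x @ W + b)        # shape (4, 10)
predicted = probs.argmax(axis=1)  # most probable class for each sample

print(probs.shape)        # (4, 10)
print(probs.sum(axis=1))  # each row sums to 1, up to rounding
```

Training amounts to choosing W and b so that the probability assigned to the true class is as high as possible across the training set.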

7. Making Predictions and Evaluating Accuracy

predicted = lg.predict(x_test)
accuracy = accuracy_score(y_test, predicted)
print(accuracy)

• Predicts labels for the test set
• Compares predictions to true labels
• Prints the accuracy score (fraction of correct predictions)
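
The accuracy score is simply the share of matching labels. A tiny hand-made example (the labels are invented for illustration and are kept as strings, like the MNIST targets) shows the arithmetic:

```python
import numpy as np

# Six invented labels; the prediction at position 4 is wrong
y_true = np.array(["3", "7", "1", "0", "7", "9"])
y_pred = np.array(["3", "7", "1", "0", "1", "9"])

correct = y_true == y_pred  # element-wise comparison, gives booleans
accuracy = correct.mean()   # fraction of True values

print(correct.sum(), "of", len(y_true), "correct")  # 5 of 6 correct
print(accuracy)                                     # 5/6, about 0.833
```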

Expected Output

The code will:

1. Display a sample digit image
2. Print an accuracy score of roughly 0.91-0.92 (91-92% accuracy) on the test set

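Key Concepts Illustrated

1. Dimensionality Reduction: PCA helps reduce computation time while largely maintaining performance
2. Multi-class Classification: Logistic regression can handle multiple classes
3. Model Evaluation: Using a held-out test set to estimate real-world performance. Note that PCA here is fitted on the full dataset before the split, so the test set is not strictly unseen; fitting PCA on the training split only gives a cleaner estimate
4. Image Data Handling: Working with flattened image vectors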

The accuracy could potentially be improved by:

• Using more PCA components (at the cost of speed)
• Trying more complex models like neural networks
• Performing hyperparameter tuning
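
One way to try different component counts is sketched below. It makes two assumptions that differ from the code above: it uses scikit-learn's small bundled digits dataset (8×8 images) so that it runs quickly without a download, and it wraps PCA and the classifier in a pipeline so PCA is fitted on the training split only. The component counts 15 and 40 are arbitrary illustration values.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Small bundled dataset: 1,797 digit images of 8x8 pixels (64 features)
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

scores = {}
for n_components in (15, 40):
    # PCA lives inside the pipeline, so it only ever sees the training split
    model = make_pipeline(
        PCA(n_components=n_components),
        LogisticRegression(max_iter=2000),
    )
    model.fit(X_train, y_train)
    scores[n_components] = model.score(X_test, y_test)  # test-set accuracy

print(scores)
```

The same loop could be pointed at MNIST by swapping in the fetch_openml data; training then takes noticeably longer.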

Image Processing Code

import pandas as pd
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

# Specify the image path
image_path = "C:/Users/GeeKs/Desktop/dagi/pictures/dagi.jpg"

# Open the image using PIL
image = Image.open(image_path)

# Resize the image to 28x28
resized_image = image.resize((28, 28))

# Convert the image to grayscale
grayscale_image = resized_image.convert("L")

# Convert to numpy array
image_array = np.array(grayscale_image)

# Create a pandas DataFrame from the 2D array
image_df = pd.DataFrame(image_array)

# Display the original image and the resized grayscale image
plt.title("Original Image")
plt.imshow(image)
plt.show()

plt.title("Grayscale Resized Image (28x28)")
plt.imshow(grayscale_image, cmap="gray")
plt.show()

This code demonstrates how to load, preprocess, and visualize an image using Python's PIL (Pillow), NumPy, pandas, and matplotlib libraries. The code prepares an image for potential machine learning applications (like digit classification) by converting it to the MNIST dataset format (28×28 grayscale).

Full Code Breakdown

1. Importing Required Libraries

import pandas as pd
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

• pandas (pd): For creating DataFrames (though not strictly necessary for this operation)
• numpy (np): For array operations and conversions
• PIL.Image: From the Pillow library, for image loading and processing
• matplotlib.pyplot (plt): For image visualization

2. Specifying Image Path

image_path = "C:/Users/GeeKs/Desktop/dagi/pictures/dagi.jpg"

• Defines the path to the image file (a chained assignment such as image_path = image_path = "..." is valid Python but redundant; a single assignment is enough)
• Uses a Windows path written with forward slashes; raw strings or doubled backslashes work as well

3. Loading the Image

image = Image.open(image_path)

• Opens the image file using PIL's Image.open()
• Creates an Image object that can be manipulated
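
Opening a file can fail if the path is wrong or the file is not an image. A minimal sketch of a guarded loader follows; the helper name load_image and the file name are hypothetical, and the sketch handles only the two most common failure cases.

```python
from PIL import Image, UnidentifiedImageError

def load_image(path):
    """Return a PIL Image, or None if the file is missing or not recognised as an image."""
    try:
        image = Image.open(path)
        image.load()  # force decoding now so that errors surface here
        return image
    except FileNotFoundError:
        print(f"File not found: {path}")
    except UnidentifiedImageError:
        print(f"Not a readable image: {path}")
    return None

# Hypothetical path that is assumed not to exist
image = load_image("no_such_directory/no_such_file.jpg")
print(image)  # None, provided the file really is absent
```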

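4. Resizing the Image

resized_image = image.resize((28, 28))

• Resizes the image to 28×28 pixels (standard MNIST dataset size), without preserving the original aspect ratio
• The default resampling filter depends on the Pillow version: since Pillow 7.0 it is bicubic for most image modes (earlier versions used nearest-neighbor), so pass resample explicitly if the filter matters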

5. Converting to Grayscale

grayscale_image = resized_image.convert("L")

• Converts the color image to 8-bit grayscale (mode "L")
• Each pixel takes a value from 0 (black) to 255 (white)

6. Converting to NumPy Array

image_array = np.array(grayscale_image)

• Converts the PIL Image object to a NumPy array
• Creates a 28×28 2D array where each element is a pixel intensity value
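
Steps 4 to 6 can be tried without a file on disk. The sketch below builds a solid-color image in memory as a stand-in for a loaded photo (an assumption for illustration) and passes an explicit resampling filter; LANCZOS is one possible choice, not the one the original code uses.

```python
import numpy as np
from PIL import Image

# In-memory 100x60 RGB image (stand-in for a photo loaded from disk)
image = Image.new("RGB", (100, 60), color=(200, 30, 30))

# Resize with an explicit filter instead of relying on the version-dependent default
resized_image = image.resize((28, 28), resample=Image.LANCZOS)

# Collapse the three color channels into one 8-bit intensity channel
grayscale_image = resized_image.convert("L")

image_array = np.array(grayscale_image)
print(image_array.shape)  # (28, 28)
print(image_array.dtype)  # uint8
```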

7. Creating a pandas DataFrame (Optional)

image_df = pd.DataFrame(image_array)

• Converts the NumPy array to a pandas DataFrame
• This step might be useful if you need tabular manipulation of pixel data
• Not strictly necessary for most image processing pipelines

8. Visualizing the Images

plt.title("Original Image")
plt.imshow(image)
plt.show()

plt.title("Grayscale Resized Image (28x28)")
plt.imshow(grayscale_image, cmap="gray")
plt.show()

• First block:
  o Shows the original color image with a title
  o Uses plt.imshow() without a colormap argument (colormaps are ignored for RGB images)
• Second block:
  o Shows the processed grayscale image
  o Uses the "gray" colormap for proper grayscale display
• Both use plt.show() to render the figures

Expected Output

When you run this code as a script, two figure windows appear one after the other (the second opens once the first is closed):

1. The original image in full color at its original resolution
2. The processed version as a 28×28 grayscale image

Key Processing Steps

1. Resizing: Standardizes the image dimensions to match common ML datasets
2. Grayscale Conversion: Reduces color information to single-channel intensity
3. Array Conversion: Prepares the image for numerical processing
4. Visualization: Verifies each transformation step

Potential Use Cases

This preprocessing pipeline is particularly useful for:

• Preparing custom images for MNIST-style digit classification
• Creating input for neural networks that expect 28×28 grayscale images
• Image processing workflows that require standardized input sizes

Possible Improvements

1. Normalization: Scale pixel values to the 0-1 range (divide by 255)
2. Inversion: MNIST digits are white on a black background, so dark-on-light images should be inverted (255 minus each pixel value)
3. Error Handling: Wrap file operations in try/except
4. Binarization: Optional thresholding for black-and-white conversion
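
Normalization, inversion, and binarization can be combined in one small helper. This is a minimal sketch under stated assumptions: the function name to_mnist_vector and its parameters are hypothetical, and the input is a synthetic 28×28 array rather than a real photo.

```python
import numpy as np

def to_mnist_vector(image_array, invert=True, threshold=None):
    """Turn a 28x28 uint8 grayscale array into a (1, 784) float array in [0, 1]."""
    pixels = image_array.astype(np.float64)
    if invert:
        pixels = 255.0 - pixels  # dark-on-light becomes light-on-dark
    if threshold is not None:
        pixels = np.where(pixels >= threshold, 255.0, 0.0)  # binarize
    return (pixels / 255.0).reshape(1, -1)  # normalize, then flatten to one row

# Synthetic stand-in: white background with a dark vertical stroke
image_array = np.full((28, 28), 255, dtype=np.uint8)
image_array[4:24, 13:15] = 0

vector = to_mnist_vector(image_array)
print(vector.shape)                # (1, 784)
print(vector.min(), vector.max())  # 0.0 1.0
```

Whatever scaling is chosen must match the training data: the PCA and logistic regression code earlier in this document was fitted on raw 0-255 pixel values, so a custom image would need the same scale and the same fitted pca.transform before prediction.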
