
Applied Analytics using Python

Assignment

Submitted by: Chetan Abbireddy (23WU0201049)


Section: Tigers DSAI
Date: 03-03-2025

Provided Code:
import numpy as np
import cv2
from skimage.feature import hog
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from tensorflow.keras.datasets import cifar10
import matplotlib.pyplot as plt

# Load CIFAR-10 dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Reduce dataset size for faster execution
X_train, y_train = X_train[:5000], y_train[:5000]
X_test, y_test = X_test[:1000], y_test[:1000]

# CIFAR-10 class names
class_names = [
    "Airplane", "Automobile", "Bird", "Cat", "Deer",
    "Dog", "Frog", "Horse", "Ship", "Truck"
]

# Extract HOG features
def extract_features(image):
    image = cv2.resize(image, (64, 64))  # Resize for consistency
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)  # Convert to grayscale
    features, _ = hog(
        gray, orientations=9, pixels_per_cell=(8, 8),
        cells_per_block=(2, 2), visualize=True
    )
    return features

# Convert all images to HOG features
X_train_hog = np.array([extract_features(img) for img in X_train])
X_test_hog = np.array([extract_features(img) for img in X_test])

# Train SVM classifier
svm = SVC(kernel='linear')
svm.fit(X_train_hog, y_train.ravel())

# Predict and evaluate
y_pred = svm.predict(X_test_hog)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy * 100:.2f}%")

# Test with a random image
def predict_image(index):
    image = X_test[index]
    features = extract_features(image).reshape(1, -1)
    prediction = svm.predict(features)

    # Display image
    plt.imshow(image)
    plt.title(f"Predicted: {class_names[prediction[0]]}")
    plt.axis('off')
    plt.show()

# Example usage
predict_image(5)  # Change index to test different images
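As a quick sanity check on the feature extraction: with the parameters above, a 64x64 image has an 8x8 grid of 8x8-pixel cells, the 2x2-cell blocks slide over 7x7 positions, and each block contributes 2 x 2 x 9 = 36 values, so every image becomes a vector of 7 x 7 x 36 = 1764 HOG features. Assuming the code above has run:

# Each row of X_train_hog is one image's flattened HOG descriptor
print(X_train_hog.shape)  # Expected: (5000, 1764)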

Original Output:

[Screenshot of the console output, Model Accuracy: 43.40%, and one sample prediction displayed with matplotlib]
Optimised Code using CNN:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical

# Load CIFAR-10 dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Flatten labels
y_train = y_train.flatten()
y_test = y_test.flatten()

# Select only two classes: Automobile (1) and Bird (2)
selected_classes = [1, 2]
train_mask = np.isin(y_train, selected_classes)
test_mask = np.isin(y_test, selected_classes)

X_train, y_train = X_train[train_mask], y_train[train_mask]
X_test, y_test = X_test[test_mask], y_test[test_mask]

# Convert labels: Automobile (1) -> 0, Bird (2) -> 1
y_train = (y_train == 2).astype(int)
y_test = (y_test == 2).astype(int)

# Normalize images (scale pixel values between 0 and 1)
X_train = X_train.astype("float32") / 255.0
X_test = X_test.astype("float32") / 255.0

# Convert labels to categorical (one-hot encoding)
y_train = to_categorical(y_train, 2)
y_test = to_categorical(y_test, 2)

# Build the CNN model
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(2, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test))

# Evaluate the model
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_acc * 100:.2f}%")

# Function to predict and display image
def predict_image(index):
    image = X_test[index]
    prediction = model.predict(image.reshape(1, 32, 32, 3))
    predicted_label = "Bird" if np.argmax(prediction) == 1 else "Automobile"

    # Display the image
    plt.imshow(image)
    plt.title(f"Predicted: {predicted_label}")
    plt.axis("off")
    plt.show()

# Test with an example image
predict_image(2)  # Change index to test different images
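To see how the errors split between the two classes, a confusion matrix can be computed after training. A minimal sketch, assuming the trained model, X_test, and the one-hot-encoded y_test from the code above:

from sklearn.metrics import confusion_matrix

# Predicted and true class indices (0 = Automobile, 1 = Bird)
pred_labels = np.argmax(model.predict(X_test), axis=1)
true_labels = np.argmax(y_test, axis=1)  # undo the one-hot encoding

print(confusion_matrix(true_labels, pred_labels))
# Rows are true classes, columns are predicted classes, in [Automobile, Bird] order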

Optimised Output:

[Screenshot of the training log, Test Accuracy: 95.60%, and one sample prediction displayed with matplotlib]
Conclusion:
There was a significant improvement in image recognition: the original HOG + SVM pipeline reached 43.40% accuracy, while the optimised CNN reached 95.60%. Part of this gap comes from the task setup, since the SVM classifies all ten CIFAR-10 classes while the CNN distinguishes only two (Automobile and Bird). Even so, the CNN (Convolutional Neural Network) is more accurate than the combination of HOG (Histogram of Oriented Gradients) features and an SVM (Support Vector Machine) because it learns directly from the raw pixels of the images rather than from hand-crafted HOG features. Another reason is that the CNN is a deep learning model: although it uses more computing power, its stacked convolutional layers learn increasingly abstract features, which gives a much more accurate result.
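A like-for-like check is possible by restricting the HOG + SVM baseline to the same two classes the CNN uses. A minimal sketch, assuming the extract_features function from the provided code is in scope and a fresh copy of the dataset is loaded:

import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from tensorflow.keras.datasets import cifar10

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
y_train, y_test = y_train.flatten(), y_test.flatten()

# Keep only Automobile (1) and Bird (2), matching the CNN experiment
train_mask = np.isin(y_train, [1, 2])
test_mask = np.isin(y_test, [1, 2])
X_train, y_train = X_train[train_mask], y_train[train_mask]
X_test, y_test = X_test[test_mask], y_test[test_mask]

# Reuse the HOG extractor defined in the provided code
X_train_hog = np.array([extract_features(img) for img in X_train])
X_test_hog = np.array([extract_features(img) for img in X_test])

svm_binary = SVC(kernel='linear')
svm_binary.fit(X_train_hog, y_train)
acc = accuracy_score(y_test, svm_binary.predict(X_test_hog))
print(f"Binary HOG + SVM accuracy: {acc * 100:.2f}%")

Whatever number this prints, comparing it with the CNN's 95.60% on the same two classes is a fairer measure of how much the learned convolutional features add over hand-crafted HOG features.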
