Model Training
import tensorflow as tf
import os

train_dir = r"D:\Program Data\archive (1)\data\train"
test_dir = r"D:\Program Data\archive (1)\data\test"

# Augment the training images with random transformations and hold out
# 20% of them for validation.
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.2,
    rotation_range=20,
    zoom_range=0.2,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical',
    seed=42,
    subset='training',
    interpolation='nearest',
    shuffle=True
)

validation_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical',
    seed=42,
    subset='validation',
    interpolation='nearest',
    shuffle=True
)

# The test images are only rescaled, not augmented.
test_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255
)

test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical',
    interpolation='nearest',
    shuffle=False
)

# Load InceptionV3 pre-trained on ImageNet, without its classification head.
base_model = tf.keras.applications.InceptionV3(weights='imagenet',
                                               include_top=False,
                                               input_shape=(224, 224, 3))

# Add a new classification head for the two classes (Close Eyes / Open Eyes).
x = base_model.output
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(512, activation='relu')(x)
x = tf.keras.layers.Dropout(0.5)(x)
x = tf.keras.layers.Dense(256, activation='relu')(x)
x = tf.keras.layers.Dropout(0.5)(x)
predictions = tf.keras.layers.Dense(2, activation='softmax')(x)
model = tf.keras.Model(inputs=base_model.input, outputs=predictions)
This code trains a deep learning model to classify driver images using the InceptionV3
architecture. It uses data augmentation techniques to generate additional training images
with random transformations such as rotations, zooms, and shifts. The model is trained on the
augmented data and evaluated on validation and test data. Transfer learning is applied by
freezing the layers of the pre-trained model and training only the newly added head. The
model's performance is measured using accuracy, and the training process can be stopped
early if the validation loss does not improve. Finally, the trained model is saved for future
use. Overall, this code demonstrates a common approach to building and training a deep
learning model for image classification tasks.
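The compilation, training, and saving steps described above are not shown in the listing. A minimal sketch, continuing from the code above, is given here; the epoch count, early-stopping patience, and saved file name are assumptions rather than values taken from the report.

# Freeze the pre-trained InceptionV3 layers so only the new head is trained.
for layer in base_model.layers:
    layer.trainable = False

# RMSprop with a small learning rate (discussed later in this report).
opt = tf.keras.optimizers.RMSprop(learning_rate=0.0001)
model.compile(optimizer=opt,
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Stop training early if the validation loss stops improving.
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                              patience=3,
                                              restore_best_weights=True)

history = model.fit(train_generator,
                    validation_data=validation_generator,
                    epochs=20,                  # assumed epoch count
                    callbacks=[early_stop])

model.save('drowsiness_model.h5')               # hypothetical file name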
import cv2
import os
import numpy as np
from pygame import mixer
from keras.models import load_model
import winsound  # imported in the original script but not used below

mixer.init()
sound = mixer.Sound('alarm.wav')

face = cv2.CascadeClassifier(
    'haar cascade files/haarcascade_frontalface_alt.xml')
leye = cv2.CascadeClassifier(
    'haar cascade files/haarcascade_lefteye_2splits.xml')
reye = cv2.CascadeClassifier(
    'haar cascade files/haarcascade_righteye_2splits.xml')

model = load_model('models/cnncat2.h5')
path = os.getcwd()
cap = cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_COMPLEX_SMALL
score = 0
thicc = 2
rpred = [99]
lpred = [99]

while True:
    ret, frame = cap.read()
    height, width = frame.shape[:2]

    # Convert the frame to grayscale before running the Haar cascades.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face.detectMultiScale(
        gray, minNeighbors=5, scaleFactor=1.1, minSize=(25, 25))
    left_eye = leye.detectMultiScale(gray)
    right_eye = reye.detectMultiScale(gray)

    # Classify each eye region as open or closed (the 24x24 grayscale input
    # and class 0 = closed are assumptions about the cnncat2.h5 model).
    for (x, y, w, h) in right_eye:
        r_eye = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
        r_eye = cv2.resize(r_eye, (24, 24)) / 255.0
        rpred = np.argmax(model.predict(r_eye.reshape(-1, 24, 24, 1),
                                        verbose=0), axis=1)
        break

    for (x, y, w, h) in left_eye:
        l_eye = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
        l_eye = cv2.resize(l_eye, (24, 24)) / 255.0
        lpred = np.argmax(model.predict(l_eye.reshape(-1, 24, 24, 1),
                                        verbose=0), axis=1)
        break

    # Both eyes closed raises the drowsiness score; otherwise it drops.
    if rpred[0] == 0 and lpred[0] == 0:
        score += 1
    else:
        score = max(score - 1, 0)
    cv2.putText(frame, 'Score: ' + str(score), (100, height - 20),
                font, 1, (255, 255, 255), 1, cv2.LINE_AA)

    if score > 15:
        # Drowsiness detected: save a snapshot and sound the alarm.
        cv2.imwrite(os.path.join(path, 'image.jpg'), frame)
        try:
            sound.play()
        except Exception:
            pass
        # Pulse a red border around the frame by varying its thickness.
        if thicc < 16:
            thicc = thicc + 2
        else:
            thicc = thicc - 2
            if thicc < 2:
                thicc = 2
        cv2.rectangle(frame, (0, 0), (width, height), (0, 0, 255), thicc)

    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Using a pre-trained CNN model, this code implements a drowsiness detection system. It captures and
processes video using OpenCV. After importing the necessary libraries, the code loads the pre-trained
model, starts the video capture, and configures Haar cascade classifiers for detecting faces, left
eyes, and right eyes. Inside the main loop, the code captures frames from the video feed and performs
face and eye detection. It preprocesses the left and right eye regions of interest for use as model
input, and the model predicts whether the eyes are open or closed based on the extracted eye images.
A score, representing the number of consecutive frames in which drowsiness was detected, is tracked
by the code. If the score rises above a threshold, an alarm sound is played through the pygame mixer,
and a pulsing red rectangle is drawn around the video frame to signal drowsiness. When the eyes are
open again, the score drops and the alarm state is reset.
os: This library provides a way to interact with the operating system; here it is used for path handling.
numpy (imported as np): NumPy is a powerful library for numerical computations in Python; here it is used to prepare the eye images for the model.
pygame.mixer: Pygame is a library often used for developing games and multimedia applications in Python; its mixer module plays the alarm sound.
winsound: This library is specific to Windows operating systems and provides a way to play sounds using the Windows sound system.
This line of code uses TensorFlow's Keras API to define an optimizer object called
opt of the RMSprop class. RMSprop is a popular optimization algorithm commonly
used in deep learning models.
The optimizer is responsible for adjusting the weights and biases of a neural network
during the training process to minimize the loss function and improve the model's
performance.
The optimizer is initialized with a learning_rate parameter set to 0.0001. The learning
rate determines the step size at each iteration of the optimization algorithm. It
controls how much the optimizer adjusts the model's parameters in response to the
computed gradients.
By setting the learning rate to 0.0001, the code is configuring the RMSprop optimizer
to make small adjustments to the model's parameters during training, which is useful
when working with complex and high-dimensional data.
Once the optimizer object is created, it can be passed as an argument to the model's
compilation step or directly used in the training loop to optimize the model.
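The line being described is, presumably, the following; it can then be passed to model.compile(optimizer=opt, ...) as shown in the training sketch earlier in this section.

opt = tf.keras.optimizers.RMSprop(learning_rate=0.0001)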
The training accuracy is a measure of how many training samples the model
classified correctly. It is the ratio of the number of correct predictions made
by the model to the total number of training samples. A high training
accuracy indicates that the model is performing well on the training data.
The evaluate() method calculates the loss and accuracy of the model on the provided
validation dataset and returns these values as a tuple. In this case, the returned
values are assigned to the variables val_loss and val_acc.
The validation loss measures how well the model is able to generalize to new, unseen
data. It indicates the difference between the predicted output and the actual output
on the validation data.
The validation accuracy is the ratio of the number of correct predictions made by the
model to the total number of validation samples. It measures the model's ability to
correctly classify unseen validation data.
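A sketch of the call being described, assuming the validation_generator defined in the training code:

val_loss, val_acc = model.evaluate(validation_generator)
print(f"Validation loss: {val_loss:.4f}, validation accuracy: {val_acc:.4f}")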
Explanation:
The test_generator object is a Python generator that generates batches of test data to
assess the model's performance on unseen data.
The evaluate() method calculates the loss and accuracy of the model on the provided
test dataset and returns these values as a tuple. In this case, the returned values are
assigned to the variables test_loss and test_acc.
The test loss measures the difference between the predicted output and the actual
output on the test data. It indicates how well the model is performing on new,
unseen data. A low test loss suggests that the model is generalizing well and making
accurate predictions.
The test accuracy is the ratio of the number of correct predictions made by the
model to the total number of test samples. It measures the model's ability to
correctly classify unseen test data.
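Analogously, a sketch for the held-out test set, using the test_generator defined earlier:

test_loss, test_acc = model.evaluate(test_generator)
print(f"Test loss: {test_loss:.4f}, test accuracy: {test_acc:.4f}")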
Test Loss: 26.015%
Test Accuracy: 91.68%
DATA VISUALISATION
plt.figure(figsize=(8, 6)) sets the size of the plot figure to have a width of 8
inches and a height of 6 inches.
plt.xlabel('Epoch') sets the label for the x-axis of the plot as "Epoch".
plt.ylabel('Accuracy') sets the label for the y-axis of the plot as "Accuracy".
The provided lines of code create a plot to visualize the training and validation
accuracy of a deep learning model over multiple epochs. The training accuracy
and validation accuracy values are retrieved from the history object, which
contains the performance metrics recorded during the model training. The
plt.plot() function is used to plot these values as lines on the graph. Axes labels
and a title are added using plt.xlabel(), plt.ylabel(), and plt.title(). A legend is
included to differentiate between the training and validation accuracy lines.
Lastly, the plot is saved as an image file. This plot provides a visual
representation of how the model's accuracy changes throughout the training
process, helping to assess its performance and potential overfitting or
underfitting.
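A sketch of the plotting code described above, assuming the History object returned by model.fit() and the default Keras metric keys; the output file name is an assumption.

import matplotlib.pyplot as plt

plt.figure(figsize=(8, 6))
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim(0.0, 1.0)                       # accuracy scale, as noted below
plt.title('Training and Validation Accuracy')
plt.legend()
plt.savefig('accuracy_plot.png')         # assumed output file name
plt.show()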
The provided lines of code create a plot to visualize the training loss and
validation loss of a deep learning model over multiple epochs. The
training and validation loss values are obtained from the history object.
The plt.plot() function is used to plot these values as lines on the
graph. The plot is labeled with axes and a title using plt.xlabel(),
plt.ylabel(), and plt.title(). A legend is included to distinguish
between the training and validation loss lines. This plot provides insights
into how the model's loss changes during training and helps assess its
performance and generalization ability.
In the accuracy plot, the plt.ylim() function is used to set the range of the y-axis
from 0.0 to 1.0, matching the accuracy scale.
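A corresponding sketch for the loss plot, using the default Keras loss keys:

plt.figure(figsize=(8, 6))
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()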
These lines of code compute and plot a confusion matrix, which is a useful tool for
evaluating the performance of a classification model.
The confusion_matrix() function takes two arrays as input: y_true, representing the
true labels, and y_pred_classes, representing the predicted labels from the model.
The resulting confusion matrix, cm, is a square matrix where each cell represents the
count or frequency of instances that were predicted in a certain class (columns) while
belonging to a certain true class (rows).
The plt.xlabel(), plt.ylabel(), and plt.title() functions are used to add labels to the
x-axis, y-axis, and the overall title of the plot, respectively. The x-axis label is set as
'Predicted label', the y-axis label as 'True label', and the title as 'Confusion Matrix'.
Overall, these lines of code calculate and plot a confusion matrix, providing a visual
representation of how well the model's predictions align with the true labels. It helps
in understanding the distribution of predictions across different classes and
identifying any patterns or misclassifications.
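A sketch of the computation described above. Obtaining y_true from test_generator.classes (valid here because the test generator was created with shuffle=False) and rendering the matrix with plt.imshow() are assumptions, since the report does not show these steps.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# True labels come from the unshuffled test generator; predictions from the model.
y_true = test_generator.classes
y_pred = model.predict(test_generator)
y_pred_classes = np.argmax(y_pred, axis=1)

cm = confusion_matrix(y_true, y_pred_classes)  # rows: true classes, columns: predicted
plt.imshow(cm, cmap='Blues')
plt.colorbar()
plt.xlabel('Predicted label')
plt.ylabel('True label')
plt.title('Confusion Matrix')
plt.show()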
Overall, these lines of code generate a classification report that provides insights into the
model's performance for each class, including precision (accuracy of positive predictions),
recall (sensitivity), F1-score (harmonic mean of precision and recall), and support (the
number of instances in each class). It helps in understanding how well the model performs
for different classes and can assist in identifying potential areas of improvement.
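A sketch of the report generation, reusing y_true and y_pred_classes from above; taking the class names from the generator's class indices is an assumption.

from sklearn.metrics import classification_report

print(classification_report(y_true, y_pred_classes,
                            target_names=list(test_generator.class_indices.keys())))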
Receiver Operating Characteristic (ROC): The roc_curve() function takes the true
labels (y_true) and the predicted probabilities for the positive class (y_pred[:, 1]) as
inputs. It calculates the False Positive Rate (FPR), True Positive Rate (TPR), and the
corresponding thresholds for different classification thresholds.
The auc() function from the sklearn.metrics module is then used to compute the
Area Under the ROC Curve (ROC AUC) score, which quantifies the overall
performance of the model across all possible classification thresholds.
The plt.plot() function is used to plot the ROC curve using the FPR and TPR values.
The curve is shown in dark orange color, with a line width of 2. The label in the
legend displays the ROC AUC score.
The plt.plot() function is also used to plot a dashed line representing the baseline,
which corresponds to a random classifier. This line is shown in navy color.
The plt.xlim() and plt.ylim() functions set the limits of the x-axis and y-axis,
respectively, to ensure the entire curve is visible.
The plt.xlabel(), plt.ylabel(), and plt.title() functions are used to add labels to the
x-axis, y-axis, and the overall title of the plot, respectively. The x-axis label is set as
'False Positive Rate', the y-axis label as 'True Positive Rate', and the title as 'Receiver
Operating Characteristic (ROC) Curve'.
The plt.legend() function is used to display a legend in the plot, positioned in the
lower-right corner.
Overall, these lines of code calculate the ROC curve and ROC AUC score, and then
plot the curve to visualize the trade-off between true positive rate and false positive
rate for different classification thresholds. The plot helps assess the model's
performance in distinguishing between the positive and negative classes.
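A sketch matching the description above; the colors, line width, and legend position follow the text, while the exact axis limits are typical choices rather than values stated in the report.

from sklearn.metrics import roc_curve, auc

fpr, tpr, thresholds = roc_curve(y_true, y_pred[:, 1])
roc_auc = auc(fpr, tpr)

plt.plot(fpr, tpr, color='darkorange', lw=2,
         label='ROC curve (AUC = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')  # random-classifier baseline
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve')
plt.legend(loc='lower right')
plt.show()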
Data Preparation
The code imports the required libraries/modules (os, shutil, glob, and tqdm) for
performing various file and directory operations, including copying/moving files,
searching for files, and displaying progress bars during the execution of a loop.
1. Raw_DIR is a variable storing the path to the source directory where
the image files are located.
2. The os.walk() function is used to iterate through all the files and
subdirectories in the Raw_DIR directory.
3. Within the loop, the tqdm() function is used to create a progress
bar for the file processing.
4. The list comprehension [f for f in filenames if
f.endswith('.png')] filters out only the files with a .png
extension.
5. For each file (i) that ends with .png, it checks the value of the fifth
element after splitting the filename by underscore (_).
6. If the fifth element is '0', the file is copied to the destination
directory Close Eyes.
7. If the fifth element is '1', the file is copied to the destination
directory Open Eyes.
To summarize, the code processes all the PNG image files in the
specified directory and copies them to different destinations based on
the value of the fifth element in the filename.
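A sketch of the loop described above; the directory paths are placeholders, and copying with shutil.copy() matches the copy operations named in the import list.

import os
import shutil
from tqdm import tqdm

Raw_DIR = r"path\to\raw_dataset"                # source directory (placeholder)
CLOSE_DIR = r"path\to\prepared\Close Eyes"      # destination folders (placeholders)
OPEN_DIR = r"path\to\prepared\Open Eyes"

for dirpath, dirnames, filenames in os.walk(Raw_DIR):
    # tqdm wraps the filtered file list to display a progress bar.
    for i in tqdm([f for f in filenames if f.endswith('.png')]):
        # The eye-state flag is the fifth underscore-separated element
        # of the filename ('0' = closed, '1' = open).
        if i.split('_')[4] == '0':
            shutil.copy(os.path.join(dirpath, i), CLOSE_DIR)
        elif i.split('_')[4] == '1':
            shutil.copy(os.path.join(dirpath, i), OPEN_DIR)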