
Model Training

import tensorflow as tf
import os

train_dir = r"D:\Program Data\archive (1)\data\train"
test_dir = r"D:\Program Data\archive (1)\data\test"

# Data augmentation for the training set; 20% of the training images are
# held out as a validation split.
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.2,
    rotation_range=20,
    zoom_range=0.2,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
)
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical',
    seed=42,
    subset='training',
    interpolation='nearest',
    shuffle=True
)
validation_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical',
    seed=42,
    subset='validation',
    interpolation='nearest',
    shuffle=True
)

# The test set is only rescaled, never augmented.
test_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255
)
test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical',
    interpolation='nearest',
    shuffle=False
)

# InceptionV3 pre-trained on ImageNet, without its classification head.
base_model = tf.keras.applications.InceptionV3(weights='imagenet',
                                               include_top=False,
                                               input_shape=(224, 224, 3))
x = base_model.output
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(512, activation='relu')(x)
x = tf.keras.layers.Dropout(0.5)(x)
x = tf.keras.layers.Dense(256, activation='relu')(x)
x = tf.keras.layers.Dropout(0.5)(x)
predictions = tf.keras.layers.Dense(2, activation='softmax')(x)
model = tf.keras.Model(inputs=base_model.input, outputs=predictions)

# Transfer learning: freeze all but the last 10 layers of the base model.
for layer in base_model.layers[:-10]:
    layer.trainable = False
for layer in base_model.layers[-10:]:
    layer.trainable = True

model.summary()

opt = tf.keras.optimizers.Nadam(learning_rate=0.0001)
model.compile(loss='categorical_crossentropy', optimizer=opt,
              metrics=['accuracy'])

history = model.fit(
    train_generator,
    epochs=8,
    validation_data=validation_generator,
    callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                patience=5)]
)

train_loss, train_acc = model.evaluate(train_generator)
print("Training loss:", train_loss)
print("Training accuracy:", train_acc)
val_loss, val_acc = model.evaluate(validation_generator)
print("Validation loss:", val_loss)
print("Validation accuracy:", val_acc)
test_loss, test_acc = model.evaluate(test_generator)
print("Test loss:", test_loss)
print("Test accuracy:", test_acc)

model.save('DDD.h5')

This code trains a deep learning model to classify driver eye images using the InceptionV3 architecture. It uses data augmentation techniques to generate additional training images with random transformations such as rotations, zooms, and shifts. The model is trained on the augmented data and evaluated on validation and test data. Transfer learning is applied by freezing most layers of the pre-trained model and training only the last few. The model's performance is measured using accuracy, and training can be stopped early if the validation loss does not improve. Finally, the trained model is saved for future use. Overall, this code demonstrates a common approach to building and training a deep learning model for image classification tasks.

import playsound
import cv2
import os
import numpy as np
from pygame import mixer
from keras.models import load_model
import winsound

mixer.init()
sound = mixer.Sound('alarm.wav')

# Haar cascade classifiers for the face and for each eye.
face = cv2.CascadeClassifier(
    'haar cascade files/haarcascade_frontalface_alt.xml')
leye = cv2.CascadeClassifier(
    'haar cascade files/haarcascade_lefteye_2splits.xml')
reye = cv2.CascadeClassifier(
    'haar cascade files/haarcascade_righteye_2splits.xml')

lbl = ['Close', 'Open']

model = load_model('models/cnncat2.h5')
path = os.getcwd()
cap = cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_COMPLEX_SMALL
count = 0
score = 0
thicc = 2
rpred = [99]
lpred = [99]

while True:
    ret, frame = cap.read()
    height, width = frame.shape[:2]

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = face.detectMultiScale(
        gray, minNeighbors=5, scaleFactor=1.1, minSize=(25, 25))
    left_eye = leye.detectMultiScale(gray)
    right_eye = reye.detectMultiScale(gray)

    # Black status bar at the bottom of the frame.
    cv2.rectangle(frame, (0, height-50), (200, height),
                  (0, 0, 0), thickness=cv2.FILLED)

    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (100, 100, 100), 1)

    # Classify the right eye as open or closed.
    for (x, y, w, h) in right_eye:
        r_eye = frame[y:y+h, x:x+w]
        count = count + 1
        r_eye = cv2.cvtColor(r_eye, cv2.COLOR_BGR2GRAY)
        r_eye = cv2.resize(r_eye, (24, 24))
        r_eye = r_eye / 255.0
        r_eye = r_eye.reshape(24, 24, -1)
        r_eye = np.expand_dims(r_eye, axis=0)
        rpred = model.predict(r_eye)
        print(rpred)
        if rpred[0][0] > rpred[0][1]:
            lbl = 'Closed'
        else:
            lbl = 'Open'

    # Classify the left eye as open or closed.
    for (x, y, w, h) in left_eye:
        l_eye = frame[y:y+h, x:x+w]
        count = count + 1
        l_eye = cv2.cvtColor(l_eye, cv2.COLOR_BGR2GRAY)
        l_eye = cv2.resize(l_eye, (24, 24))
        l_eye = l_eye / 255.0
        l_eye = l_eye.reshape(24, 24, -1)
        l_eye = np.expand_dims(l_eye, axis=0)
        lpred = model.predict(l_eye)
        if lpred[0][0] > lpred[0][1]:
            lbl = 'Closed'
        else:
            lbl = 'Open'

    # Both eyes closed: increment the drowsiness score; otherwise reset it.
    if rpred[0][0] > rpred[0][1] and lpred[0][0] > lpred[0][1]:
        score = score + 1
        cv2.putText(frame, 'Closed', (10, height-20), font,
                    1, (255, 255, 255), 1, cv2.LINE_AA)
    else:
        score = 0
        cv2.putText(frame, 'Open', (10, height-20), font,
                    1, (255, 255, 255), 1, cv2.LINE_AA)

    if score < 0:
        score = 0
    cv2.putText(frame, 'Score:' + str(score), (100, height-20),
                font, 1, (255, 255, 255), 1, cv2.LINE_AA)

    # Eyes closed for several consecutive frames: save a snapshot, beep,
    # and flash a red border whose thickness pulses.
    if score > 3:
        cv2.imwrite(os.path.join(path, 'image.jpg'), frame)
        try:
            winsound.Beep(1000, 1000)
        except Exception:
            pass
        if thicc < 16:
            thicc = thicc + 2
        else:
            thicc = thicc - 2
            if thicc < 2:
                thicc = 2
        cv2.rectangle(frame, (0, 0), (width, height), (0, 0, 255), thicc)

    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

mixer.init()
sound = mixer.Sound('alarm.wav')

face = cv2.CascadeClassifier(
    'haar cascade files/haarcascade_frontalface_alt.xml')
leye = cv2.CascadeClassifier(
    'haar cascade files/haarcascade_lefteye_2splits.xml')
reye = cv2.CascadeClassifier(
    'haar cascade files/haarcascade_righteye_2splits.xml')

lbl = ['Close', 'Open']

model = load_model('models/cnncat2.h5')
path = os.getcwd()
cap = cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_COMPLEX_SMALL
count = 0
score = 0
thicc = 2
rpred = [99]
lpred = [99]

while True:
    ret, frame = cap.read()
    height, width = frame.shape[:2]

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = face.detectMultiScale(
        gray, minNeighbors=5, scaleFactor=1.1, minSize=(25, 25))
    left_eye = leye.detectMultiScale(gray)
    right_eye = reye.detectMultiScale(gray)

    cv2.rectangle(frame, (0, height-50), (200, height),
                  (0, 0, 0), thickness=cv2.FILLED)

    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (100, 100, 100), 1)

    for (x, y, w, h) in right_eye:
        r_eye = frame[y:y+h, x:x+w]
        count = count + 1
        r_eye = cv2.cvtColor(r_eye, cv2.COLOR_BGR2GRAY)
        r_eye = cv2.resize(r_eye, (24, 24))
        r_eye = r_eye / 255.0
        r_eye = r_eye.reshape(24, 24, -1)
        r_eye = np.expand_dims(r_eye, axis=0)
        rpred = model.predict(r_eye)
        print(rpred)
        if rpred[0][0] > rpred[0][1]:
            lbl = 'Closed'
        else:
            lbl = 'Open'

    for (x, y, w, h) in left_eye:
        l_eye = frame[y:y+h, x:x+w]
        count = count + 1
        l_eye = cv2.cvtColor(l_eye, cv2.COLOR_BGR2GRAY)
        l_eye = cv2.resize(l_eye, (24, 24))
        l_eye = l_eye / 255.0
        l_eye = l_eye.reshape(24, 24, -1)
        l_eye = np.expand_dims(l_eye, axis=0)
        lpred = model.predict(l_eye)
        if lpred[0][0] > lpred[0][1]:
            lbl = 'Closed'
        else:
            lbl = 'Open'

    if rpred[0][0] > rpred[0][1] and lpred[0][0] > lpred[0][1]:
        score = score + 1
        cv2.putText(frame, 'Closed', (10, height-20), font,
                    1, (255, 255, 255), 1, cv2.LINE_AA)
    else:
        score = 0
        # score = score - 1
        cv2.putText(frame, 'Open', (10, height-20), font,
                    1, (255, 255, 255), 1, cv2.LINE_AA)

    if score < 0:
        score = 0
    cv2.putText(frame, 'Score:' + str(score), (100, height-20),
                font, 1, (255, 255, 255), 1, cv2.LINE_AA)

    # Higher threshold before the alarm; the score is reset to 4 so the
    # alarm repeats if the eyes stay closed.
    if score > 7:
        # cv2.imwrite(os.path.join(path, 'image.jpg'), frame)
        try:
            score = 4
            playsound.playsound("alarm.wav")
        except Exception:
            pass
        if thicc < 16:
            thicc = thicc + 2
        else:
            thicc = thicc - 2
            if thicc < 2:
                thicc = 2
        cv2.rectangle(frame, (0, 0), (width, height), (0, 0, 255), thicc)

    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Drowsiness main file:

Using a pre-trained CNN model, this code implements a drowsiness detection system. It captures and processes video using OpenCV. After importing the necessary libraries, the code loads the pre-trained model, starts the video capture, and configures Haar cascade classifiers for detecting faces, left eyes, and right eyes. Inside the main loop, the code captures frames from the video feed and performs face and eye detection. It pre-processes the left-eye and right-eye regions of interest for use as model input, and the model predicts whether the eyes are open or closed from the extracted eye images. The code keeps track of a score, which represents the number of consecutive frames in which closed eyes were detected. If the score rises above a threshold, an alarm sound is played and a red rectangle is drawn around the video frame to signal drowsiness. Once the eyes open again, the score is reset and the alarm stops.

os: This library provides a way to interact with the operating system.

numpy (imported as np): Numpy is a powerful library for numerical computations in Python.

pygame.mixer: Pygame is a library often used for developing games and multimedia
applications in Python.

keras.models.load_model: Keras is a high-level deep learning framework that simplifies the process of building and training neural network models.

winsound: This library is specific to Windows operating systems and provides a way to play
sounds using the Windows sound system.

opt = tf.keras.optimizers.RMSprop(learning_rate=0.0001)

Explanation

This line of code is using TensorFlow's Keras API to define an optimizer object called
"opt" of the RMSprop class. RMSprop is a popular optimization algorithm commonly
used in deep learning models.

The optimizer is responsible for adjusting the weights and biases of a neural network
during the training process to minimize the loss function and improve the model's
performance.

The optimizer is initialized with a learning_rate parameter set to 0.0001. The learning
rate determines the step size at each iteration of the optimization algorithm. It
controls how much the optimizer adjusts the model's parameters in response to the
computed gradients.

By setting the learning rate to 0.0001, the code is configuring the RMSprop optimizer
to make small adjustments to the model's parameters during training, which is useful
when working with complex and high-dimensional data.

Once the optimizer object is created, it can be passed as an argument to the model's
compilation step or directly used in the training loop to optimize the model.
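A minimal sketch of how this optimizer can be defined and attached to a model (note that the training script earlier in this document uses Nadam rather than RMSprop, and the tiny placeholder model below exists only so the snippet is self-contained):

import tensorflow as tf

# Placeholder model standing in for the InceptionV3-based network built in
# the training script; it only illustrates where the optimizer is plugged in.
model = tf.keras.Sequential([
    tf.keras.layers.GlobalAveragePooling2D(input_shape=(224, 224, 3)),
    tf.keras.layers.Dense(2, activation='softmax'),
])

# RMSprop with a small learning rate (0.0001) makes conservative updates to
# the model's weights at each optimization step.
opt = tf.keras.optimizers.RMSprop(learning_rate=0.0001)

# The optimizer object is passed to compile() together with the loss and metrics.
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])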

train_loss, train_acc = model.evaluate(train_generator)
print("Training loss:", train_loss)
print("Training accuracy:", train_acc)

Explanation

train_loss, train_acc = model.evaluate(train_generator): This line uses the evaluate method of the model object to compute the loss and accuracy of the model on the training dataset. train_generator represents the training data generator, which generates batches of training samples. The evaluate method calculates the loss and accuracy by forwarding the training samples through the model and comparing the predicted outputs with the ground-truth labels.

print("Training loss:", train_loss): This line prints the training loss, which is a measure of how well the model is able to minimize the difference between the predicted output and the actual output on the training data.

The training accuracy is a measure of how many training samples the model classified correctly. It is the ratio of the number of correct predictions made by the model to the total number of training samples. A high training accuracy indicates that the model is performing well on the training data.

Training loss: 93.11%
Training accuracy: 96.99%

val_loss, val_acc = model.evaluate(validation_generator)
print("Validation loss:", val_loss)
print("Validation accuracy:", val_acc)

This code evaluates the performance of the trained deep learning model on a validation dataset using the evaluate() method of the Keras Model object.

The validation_generator object is a Python generator that generates batches of validation data on which to evaluate the model's performance.

The evaluate() method calculates the loss and accuracy of the model on the provided validation dataset and returns these values as a tuple. In this case, the returned values are assigned to the variables val_loss and val_acc.

The validation loss measures how well the model is able to generalize to new, unseen
data. It indicates the difference between the predicted output and the actual output
on the validation data.

The validation accuracy is the ratio of the number of correct predictions made by the
model to the total number of validation samples. It measures the model's ability to
correctly classify unseen validation data.

Validation loss: 20.016%
Validation accuracy: 94.129%

test_loss, test_acc = model.evaluate(test_generator)
print("Test loss:", test_loss)
print("Test accuracy:", test_acc)

Explanation:

The test_generator object is a Python generator that generates batches of test data to assess the model's performance on unseen data.

The evaluate() method calculates the loss and accuracy of the model on the provided test dataset and returns these values as a tuple. In this case, the returned values are assigned to the variables test_loss and test_acc.

The test loss measures the difference between the predicted output and the actual
output on the test data. It indicates how well the model is performing on new,
unseen data. A low test loss suggests that the model is generalizing well and making
accurate predictions.

The test accuracy is the ratio of the number of correct predictions made by the
model to the total number of test samples. It measures the model's ability to
correctly classify unseen test data.

Test loss: 26.015%
Test accuracy: 91.68%

DATA VISUALISATION

plt.figure(figsize=(8, 6)) sets the size of the plot figure to have a width of 8 inches and a height of 6 inches.

plt.plot(history.history['accuracy'], label='Training accuracy') plots the training accuracy values stored in the history object, which is typically obtained after training a deep learning model. It creates a line plot of the training accuracy values over the epochs.

plt.plot(history.history['val_accuracy'], label='Validation accuracy') plots the validation accuracy values stored in the history object. It creates a line plot of the validation accuracy values over the epochs.

plt.xlabel('Epoch') sets the label for the x-axis of the plot as "Epoch".

plt.ylabel('Accuracy') sets the label for the y-axis of the plot as "Accuracy".

plt.title('Training and Validation Accuracy') sets the title of the plot as "Training and Validation Accuracy".

plt.legend(loc='lower right') displays a legend in the lower-right corner of the plot, differentiating between the training and validation accuracy lines.

plt.savefig('/content/plots/training_validation_accuracy.png') saves the plot as an image file with the specified path and filename.

Short Para
The provided lines of code create a plot to visualize the training and validation
accuracy of a deep learning model over multiple epochs. The training accuracy
and validation accuracy values are retrieved from the history object, which
contains the performance metrics recorded during the model training. The
plt.plot() function is used to plot these values as lines on the graph. Axes labels
and a title are added using plt.xlabel(), plt.ylabel(), and plt.title(). A legend is
included to differentiate between the training and validation accuracy lines.
Lastly, the plot is saved as an image file. This plot provides a visual
representation of how the model's accuracy changes throughout the training
process, helping to assess its performance and potential overfitting or
underfitting.
The provided lines of code create a plot to visualize the training loss and
validation loss of a deep learning model over multiple epochs. The
training and validation loss values are obtained from the history object.
The plt.plot() function is used to plot these values as lines on the
graph. The plot is labeled with axes and a title using plt.xlabel(),
plt.ylabel(), and plt.title(). A legend is included to distinguish
between the training and validation loss lines. This plot provides insights
into how the model's loss changes during training and helps assess its
performance and generalization ability.
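A short matplotlib sketch along the lines described in the two paragraphs above, assuming history is the object returned by model.fit() in the training script; the /content/plots directory from the savefig() call is created here if it does not exist:

import os
import matplotlib.pyplot as plt

os.makedirs('/content/plots', exist_ok=True)

# Training vs. validation accuracy over the epochs.
plt.figure(figsize=(8, 6))
plt.plot(history.history['accuracy'], label='Training accuracy')
plt.plot(history.history['val_accuracy'], label='Validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend(loc='lower right')
plt.savefig('/content/plots/training_validation_accuracy.png')

# Training vs. validation loss over the epochs.
plt.figure(figsize=(8, 6))
plt.plot(history.history['loss'], label='Training loss')
plt.plot(history.history['val_loss'], label='Validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss')
plt.legend(loc='upper right')
plt.savefig('/content/plots/training_validation_loss.png')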

The plt.figure() function is used to specify the size of the plot.

The plt.bar() function is used to create a bar chart with a single bar representing the testing accuracy. The label for the x-axis is set to 'Test', indicating that this bar represents the testing dataset. The height of the bar is set to the value of test_acc, which represents the accuracy of the model on the testing dataset.

The plt.ylim() function is used to set the range of the y-axis from 0.0 to 1.0, indicating the accuracy scale.

The plt.xlabel(), plt.ylabel(), and plt.title() functions are used to add labels to the x-axis, y-axis, and the overall title of the plot, respectively. In this case, the x-axis is labeled as 'Dataset', the y-axis as 'Accuracy', and the title as 'Testing Accuracy'.
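A minimal sketch of that bar chart, assuming test_acc holds the accuracy returned by model.evaluate(test_generator):

import matplotlib.pyplot as plt

plt.figure(figsize=(4, 6))
# A single bar labelled 'Test' whose height is the testing accuracy.
plt.bar(['Test'], [test_acc])
plt.ylim(0.0, 1.0)  # accuracy scale from 0 to 1
plt.xlabel('Dataset')
plt.ylabel('Accuracy')
plt.title('Testing Accuracy')
plt.show()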

These lines of code compute and plot a confusion matrix, which is a useful tool for
evaluating the performance of a classification model.

The confusion_matrix() function takes two arrays as input: y_true, representing the
true labels, and y_pred_classes, representing the predicted labels from the model.

The resulting confusion matrix, cm, is a square matrix where each cell represents the
count or frequency of instances that were predicted in a certain class (columns) while
belonging to a certain true class (rows).

The plt.figure() function is used to specify the size of the plot.

The sns.heatmap() function from the Seaborn library is used to create a heatmap visualization of the confusion matrix. The annot=True parameter displays the count values within each cell. The cmap=plt.cm.Blues parameter sets the color map to shades of blue. The fmt='g' parameter formats the count values in a general numeric format.

The plt.xlabel(), plt.ylabel(), and plt.title() functions are used to add labels to the
x-axis, y-axis, and the overall title of the plot, respectively. The x-axis label is set as
'Predicted label', the y-axis label as 'True label', and the title as 'Confusion Matrix'.

Overall, these lines of code calculate and plot a confusion matrix, providing a visual
representation of how well the model's predictions align with the true labels. It helps
in understanding the distribution of predictions across different classes and
identifying any patterns or misclassifications.
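A sketch of those steps, assuming y_true and y_pred_classes are derived from the model's predictions on the (unshuffled) test generator defined in the training script:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Predicted class indices and ground-truth labels for the test set.
y_pred = model.predict(test_generator)
y_pred_classes = np.argmax(y_pred, axis=1)
y_true = test_generator.classes

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred_classes)

plt.figure(figsize=(6, 5))
sns.heatmap(cm, annot=True, cmap=plt.cm.Blues, fmt='g')
plt.xlabel('Predicted label')
plt.ylabel('True label')
plt.title('Confusion Matrix')
plt.show()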

Overall, these lines of code generate a classification report that provides insights into the
model's performance for each class, including precision (accuracy of positive predictions),
recall (sensitivity), F1-score (harmonic mean of precision and recall), and support (the
number of instances in each class). It helps in understanding how well the model performs
for different classes and can assist in identifying potential areas of improvement.
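A corresponding sketch, reusing y_true and y_pred_classes from the confusion-matrix example and the class names recorded by the test generator:

from sklearn.metrics import classification_report

# Precision, recall, F1-score and support for each class.
print(classification_report(y_true, y_pred_classes,
                            target_names=list(test_generator.class_indices.keys())))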
Receiver Operating Characteristic (ROC): The roc_curve() function takes the true labels (y_true) and the predicted probabilities for the positive class (y_pred[:, 1]) as inputs. It calculates the False Positive Rate (FPR), True Positive Rate (TPR), and the corresponding thresholds for different classification thresholds.

The auc() function from the sklearn.metrics module is then used to compute the
Area Under the ROC Curve (ROC AUC) score, which quantifies the overall
performance of the model across all possible classification thresholds.

The plt.figure() function is used to specify the size of the plot.

The plt.plot() function is used to plot the ROC curve using the FPR and TPR values.
The curve is shown in dark orange color, with a line width of 2. The label in the
legend displays the ROC AUC score.

The plt.plot() function is also used to plot a dashed line representing the baseline,
which corresponds to a random classifier. This line is shown in navy color.

The plt.xlim() and plt.ylim() functions set the limits of the x-axis and y-axis,
respectively, to ensure the entire curve is visible.

The plt.xlabel(), plt.ylabel(), and plt.title() functions are used to add labels to the
x-axis, y-axis, and the overall title of the plot, respectively. The x-axis label is set as
'False Positive Rate', the y-axis label as 'True Positive Rate', and the title as 'Receiver
Operating Characteristic (ROC) Curve'.

The plt.legend() function is used to display a legend in the plot, positioned in the
lower-right corner.

Overall, these lines of code calculate the ROC curve and ROC AUC score, and then
plot the curve to visualize the trade-off between true positive rate and false positive
rate for different classification thresholds. The plot helps assess the model's
performance in distinguishing between the positive and negative classes.
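A sketch of the ROC computation and plot described above, again reusing y_true and the predicted probabilities y_pred from the earlier examples:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

# Probabilities for the positive class (column 1 of the softmax output).
fpr, tpr, thresholds = roc_curve(y_true, y_pred[:, 1])
roc_auc = auc(fpr, tpr)

plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, color='darkorange', lw=2,
         label='ROC curve (AUC = %0.2f)' % roc_auc)
# Dashed diagonal: the baseline of a random classifier.
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve')
plt.legend(loc='lower right')
plt.show()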

Data Preparation

Import the required libraries/modules: os, shutil, glob, and tqdm, for performing various file and directory operations, including copying/moving files, searching for files, and displaying progress bars during the execution of a loop.

1. Raw_DIR is a variable storing the path to the source directory where the image files are located.
2. The os.walk() function is used to iterate through all the files and subdirectories in the Raw_DIR directory.
3. Within the loop, the tqdm() function is used to create a progress bar for the file processing.
4. The list comprehension [f for f in filenames if f.endswith('.png')] filters out only the files with a .png extension.
5. For each file (i) that ends with .png, it checks the value of the fifth element after splitting the filename by underscore (_).
6. If the fifth element is '0', the file is copied to the destination directory Close Eyes.
7. If the fifth element is '1', the file is copied to the destination directory Open Eyes.

To summarize, the code processes all the PNG image files in the specified directory and copies them to different destinations based on the value of the fifth element in the filename.
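A minimal sketch of this preparation step, assuming filenames in which the eye-state flag is the fifth underscore-separated field, as described above; the source and destination paths are placeholders to adapt to the actual dataset location:

import os
import shutil
from tqdm import tqdm

# Placeholder paths; adjust to the actual raw-data and output directories.
Raw_DIR = r"D:\Program Data\raw_eye_images"
CLOSED_DIR = r"D:\Program Data\Prepared_Data\Close Eyes"
OPEN_DIR = r"D:\Program Data\Prepared_Data\Open Eyes"

os.makedirs(CLOSED_DIR, exist_ok=True)
os.makedirs(OPEN_DIR, exist_ok=True)

for dirpath, dirnames, filenames in os.walk(Raw_DIR):
    # Keep only .png images, with a tqdm progress bar per directory.
    for i in tqdm([f for f in filenames if f.endswith('.png')]):
        # The fifth underscore-separated field encodes the eye state:
        # '0' = closed, '1' = open.
        if i.split('_')[4] == '0':
            shutil.copy(os.path.join(dirpath, i), CLOSED_DIR)
        elif i.split('_')[4] == '1':
            shutil.copy(os.path.join(dirpath, i), OPEN_DIR)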
