Autism With Faces

<h2 style="color:#FFFFFF; font-family: 'Times New Roman', Times,

serif;">Data Loading and Exploration</h2>

First, we import the libraries that our project will rely on and that will help us build our neural network.

import os
import sys

import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from keras import optimizers, metrics, callbacks

from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report

In this notebook, our objective is to load the dataset containing images of both autistic and non-
autistic children. This dataset is organized into three distinct folders: train, valid, and test,
each containing a portion of the dataset.

Rather than keeping the data separated in these folders, we will consolidate them into a single dataset. After combining the data, we shuffle the resulting list of images. This step ensures randomness in our dataset, which is crucial for effective training and evaluation of machine learning models.

train = []
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        full_path = os.path.join(dirname, filename)
        img = cv2.imread(full_path)
        if img is not None:
            # Resize every face image to a common 200x200 resolution.
            img = cv2.resize(img, (200, 200))
            relative_path = os.path.relpath(
                full_path,
                '/kaggle/input/autism-facial-image-dataset/Facial images dataset for Autism Detection/')
            train.append([img, relative_path, full_path])
        else:
            print(f'Failed to load image: {full_path}')

train_data = []
for item in train:
    img = item[0]
    # The relative path looks like 'train/autistic/xxx.jpg', so its second
    # component is the class folder ('autistic' or 'non_autistic').
    label = item[1].split('/')[1]
    path = item[2]
    is_autistic = 0 if label == 'autistic' else 1
    train_data.append([img, label, path, is_autistic])
np.random.shuffle(train_data)
print(f'There are {len(train_data)} images')

<h4 style="color:#FFFFFF; font-family: 'Times New Roman', Times,


serif;">Data Visualization</h4>

Let's create a sample visualization using the shuffled image data obtained earlier. We'll utilize Matplotlib to plot a grid of faces together with their labels, showcasing its versatility and ease of use. Through this visualization, we can explore the images and how they are labeled, demonstrating the power of data visualization in understanding and communicating insights.

def show_images(data):
    num_images = min(len(data), 20)
    fig, axes = plt.subplots(4, 5, figsize=(20, 15))
    for i in range(num_images):
        img_path = data[i][2]
        label = data[i][1]
        img = mpimg.imread(img_path)
        row = i // 5
        col = i % 5
        axes[row, col].imshow(img)
        axes[row, col].set_title(label)
        axes[row, col].axis('off')
    plt.subplots_adjust(wspace=0.2, hspace=0.3)
    plt.show()

show_images(train_data)
<h2 style="color:#FFFFFF; font-family: 'Times New Roman', Times,
serif;">Exploratory data analysis</h2>

<h4 style="color:#FFFFFF; font-family: 'Times New Roman', Times,


serif;">Descriptive Stadistics</h4>

Upon examining the obtained list, we'll count the children with and without autism. This analysis provides insight into the distribution of autism labels within the list.

Description of the partitions:


• This provides a summary of the 'Autism' column, including count, unique values, the
most frequent value, and frequency of the most frequent value.

Value counts for each partition:


• This shows the count of each unique value in the 'Autism' column, providing insight into
the distribution of data among different partitions.
df = pd.DataFrame(train_data,
                  columns=['Location', 'Autism', 'TotalLocation', 'Autism_Count'])
print("\nDescription of Partitions:")
print(df['Autism'].describe())
print("\nValue Counts in Each Partition:")
print(df['Autism'].value_counts())

plt.figure(figsize=(8, 6))
ax = df['Autism'].value_counts().plot(kind='bar', color='skyblue')
plt.title('Distribution of Data')
plt.xlabel('Partition')
plt.ylabel('Count')
plt.xticks(rotation=0)
for i, v in enumerate(df['Autism'].value_counts()):
    ax.text(i, v + 0.1, str(v), ha='center', va='bottom')
plt.show()

From the output above we can verify that there are an equal number of autistic and non-autistic cases, and that these are the only two labels present in the list.

<h2 style="color:#FFFFFF; font-family: 'Times New Roman', Times,


serif;">Modeling and autism detection</h2>

<h4 style="color:#FFFFFF; font-family: 'Times New Roman', Times,


serif;">Image Preproccesing</h4>

We now split the previously generated values into training, validation, and test sets at random, selecting samples in proportion to the amount of data we have.
y = []
x = []
np.random.shuffle(train_data)
for i in range(len(train_data)):
    x.append(train_data[i][0])   # image array
    y.append(train_data[i][3])   # binary label: 0 = autistic, 1 = non_autistic

y = np.array(y)
x = np.array(x)
x_train_full, x_test, y_train_full, y_test = train_test_split(x, y, test_size=0.2)
print(f'Y has {y_train_full.shape[0]} values and X has {x_train_full.shape[0]} values')

# Hold out the first 250 training images as a validation set and scale pixels to [0, 1].
X_valid, X_train = x_train_full[:250] / 255., x_train_full[250:] / 255.
y_valid, y_train = y_train_full[:250], y_train_full[250:]
X_test = x_test / 255.

x_train_full.shape, y_test.shape
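Since the two classes are balanced, the random split above should already yield roughly balanced partitions. If we wanted to guarantee that, train_test_split accepts a stratify argument; below is a minimal variant of the split, assuming the same x and y arrays (the random_state value is purely illustrative).

# Stratified split: preserves the autistic / non-autistic proportion in both partitions.
x_train_full, x_test, y_train_full, y_test = train_test_split(
    x, y, test_size=0.2, stratify=y, random_state=42)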

<h4 style="color:#FFFFFF; font-family: 'Times New Roman', Times,


serif;">Training facial recognition models</h4>

Now we declare the model that we are going to use. It will consist of:

Definition of input and output dimensions:

• dimension_entrada: defines the input dimensions of the model (image height, width, and channels), excluding the batch dimension.
• dimension_salida: defines the output dimension of the model, in this case 1 for binary classification.

Sequential Model Definition:

• Sequential Keras model, a linear stack of layers.

Adding Layers to the Model:

• Convolutional and MaxPooling layers for feature extraction from images.


• Densely connected layers for final classification.
• Output layer with sigmoid activation for binary classification probability.

Model Summary:

• Prints the architecture of the model and the number of trainable parameters.

Model Compilation:

• Uses the Adam optimizer with a specified learning rate.


• Uses binary crossentropy loss function for binary classification problems.
• Defines accuracy as the metric to evaluate model performance during training; precision and recall could be tracked as well (see the sketch after the compile call below).
dimension_entrada = x_train_full.shape[1:4]   # (height, width, channels) = (200, 200, 3)
dimension_salida = 1
model = models.Sequential()
model.add(layers.Conv2D(32, (3,3), activation="relu", input_shape=dimension_entrada))
model.add(layers.BatchNormalization())
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(64, (3,3), activation="relu"))
model.add(layers.BatchNormalization())
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(128, (3,3), activation="relu"))
model.add(layers.BatchNormalization())
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(256, (3,3), activation="relu"))
model.add(layers.BatchNormalization())
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(512, (3,3), activation="relu"))
model.add(layers.BatchNormalization())
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Flatten())
# model.add(layers.Dense(512, activation="relu"))
# model.add(layers.Dropout(0.5))
model.add(layers.Dense(256, activation="relu"))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(128, activation="relu"))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation="sigmoid"))
model.summary()
opt = optimizers.Adam(learning_rate=0.00001)
model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy"])

We will now train the previously defined model:

early_stopping_cb = keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)
# epochs=sys.maxsize effectively means "train until early stopping triggers".
history = model.fit(X_train, y_train,
                    epochs=sys.maxsize,
                    validation_data=(X_valid, y_valid),
                    callbacks=[early_stopping_cb])
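Optionally, a ModelCheckpoint callback could be used alongside early stopping so that the best weights are also written to disk; the filename below is illustrative.

# Hypothetical filename; keeps only the model with the best validation loss seen so far.
checkpoint_cb = keras.callbacks.ModelCheckpoint("best_autism_model.h5", save_best_only=True)
# It would then be passed together with early stopping:
# callbacks=[early_stopping_cb, checkpoint_cb]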

In the following graph we can observe how the metrics recorded during training evolve: loss and val_loss follow similar trends, as do accuracy and val_accuracy.
pd.DataFrame(history.history).plot(figsize=(8, 5))
plt.grid(True)
plt.gca().set_ylim(0, 1)
plt.show()
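Because the History object keeps one entry per epoch, it can also be inspected directly to see how long training actually ran before early stopping intervened. A short sketch, using the metric keys produced by the compile settings above:

# Number of epochs actually run and the epoch with the lowest validation loss.
n_epochs_run = len(history.history["loss"])
best_epoch = int(np.argmin(history.history["val_loss"])) + 1
print(f"Trained for {n_epochs_run} epochs; best val_loss at epoch {best_epoch}")
print(f"Best val_accuracy: {max(history.history['val_accuracy']):.4f}")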

<h4 style="color:#FFFFFF; font-family: 'Times New Roman', Times,


serif;">Evaluating autism detection capability</h4>

Finally, we will evaluate the model once the training is finished.

evaluation = model.evaluate(X_test, y_test)
test_loss = evaluation[0]
accuracy = evaluation[1]
print("Accuracy:", accuracy)
print("Test Loss:", test_loss)

The confusion matrix below shows the correctly classified samples on the main diagonal and the misclassified samples off the diagonal.

y_test_pred = model.predict(X_test)
y_test_pred_labels = (y_test_pred > 0.5).astype(int).ravel()
y_test_true_labels = y_test
cm = confusion_matrix(y_test_true_labels, y_test_pred_labels)
sns.heatmap(cm, annot=True, fmt='d', cmap="crest", linewidths=.5)
plt.show()
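Assuming the label encoding used when the data was loaded (0 = autistic, 1 = non_autistic), per-class recall can be read directly from the matrix; the classification report below yields the same figures.

# Rows of cm are true labels, columns are predictions.
recall_autistic = cm[0, 0] / cm[0].sum()        # fraction of autistic faces correctly detected
recall_non_autistic = cm[1, 1] / cm[1].sum()    # fraction of non-autistic faces correctly detected
print(f"Recall (autistic): {recall_autistic:.4f}")
print(f"Recall (non-autistic): {recall_non_autistic:.4f}")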

The classification_report function generates a detailed report that includes various metrics to evaluate the performance of a classification model. It compares the true labels (y_test_true_labels) with the predicted labels (y_test_pred_labels) and computes metrics such as precision, recall, F1-score, and support for each class. These metrics provide insights into how well the model is performing for each class in the classification problem.

report = classification_report(y_test_true_labels, y_test_pred_labels, digits=4)
print(report)

def plot_image_with_labels(image, true_label, pred_label, ax):
    ax.imshow(image, cmap='gray')
    ax.axis('off')
    ax.set_title(f'Real: {true_label}\nPredicted: {pred_label}', fontsize=10)

fig, axs = plt.subplots(nrows=20, ncols=5, figsize=(12, 30))

# Show the first 100 test images with their true and predicted labels.
for i in range(100):
    image = x_test[i]
    true_label = 'autistic' if y_test_true_labels[i] == 0 else 'non_autistic'
    pred_label = 'autistic' if y_test_pred_labels[i] == 0 else 'non_autistic'
    plot_image_with_labels(image, true_label, pred_label, axs[i // 5, i % 5])

for row in axs:
    for ax in row:
        ax.axis('off')

plt.tight_layout()
plt.show()
