Deep Learning TensorFlow and Keras
Sigmoid function
Gradient descent
What is TensorFlow?
Why TensorFlow?
TensorFlow basics
Tensors
Variables
Automatic differentiation
Data pre-processing
Data transformation
How to stop model training at the right time with early stopping
Final thoughts
Convolution
Padding
Apply ReLU
Pooling
Dropout regularization
Flattening
Full connection
Activation function
Data preprocessing
Model definition
Generate a tf.data.Dataset
Image augmentation
Model definition
Model evaluation
CNN architectures
Final thoughts
Weaknesses of RNNs
Applications of LSTM
Bidirectional LSTM
Imports
Data pre-processing
Imports
Load dataset
Data cleaning
Label exploration
Text vectorization
Final thoughts
Google’s Word2vec
Fasttext
Hugging Face
Prediction
Feature extraction
Fine-tuning
Data pre-processing
Data pre-processing
Final thoughts
TensorBoard
PIP installation
Conda installation
Docker installation
TensorBoard dashboards
TensorBoard scalars
TensorBoard images
TensorBoard graphs
TensorBoard distributions
TensorBoard histograms
Fairness indicators
TensorFlow Profiler
Overview page
Trace viewer
TensorFlow stats
TensorBoard in PyTorch
TensorBoard in Keras
TensorBoard in XGBoost
Tensorboard.dev
Final thoughts
Connecting layers
Multilayer perceptron
Data download
Data processing
Label encoding
Final thoughts
Obtain dataset
Data processing
Create an optimizer
Final thoughts
TensorFlow
PyTorch
Final thoughts
Object detection with TensorFlow 2 Object detection API
Visualize detections
Final thoughts
Appendix
Disclaimer
Copyright
https://github.com/mlnuggets/tensorflow
TensorFlow
the network, while the axon terminals represent the output. The
output.
Random weights and biases are initialized when data is passed
leading to output.
Sigmoid function
The sigmoid activation function caps output to a number
answer.
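As a rough illustration of how the sigmoid squashes any real number into the range between 0 and 1, here is a minimal sketch in plain Python. The function name `sigmoid` and the sample inputs are chosen for this example only; they are not taken from the book's code.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative inputs use an equivalent form that avoids overflow in exp().
    e = math.exp(z)
    return e / (1.0 + e)


# Large negative inputs approach 0, large positive inputs approach 1.
for z in (-5.0, 0.0, 5.0):
    print(z, round(sigmoid(z), 4))
```

An input of 0 maps to exactly 0.5, which is why 0.5 is a common threshold for turning the output into a yes or no answer.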
learn?
the weights that minimize the error, also known as the loss
You might see NaNs in the loss function while training the
Gradient descent
to the bottom of the hill, that is, the global minimum. This
How backpropagation works
as backpropagation.
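To make the idea of walking downhill concrete, the sketch below runs gradient descent on a toy loss with a single weight. The loss function, learning rate, and step count are assumptions picked for illustration; a real network repeats the same update for every weight using gradients obtained through backpropagation.

```python
def loss(w: float) -> float:
    """Toy loss with its minimum at w = 3 (an assumption for this example)."""
    return (w - 3.0) ** 2


def grad(w: float) -> float:
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


def gradient_descent(w0: float, learning_rate: float = 0.1, steps: int = 100) -> float:
    """Repeatedly move the weight a small step against the gradient."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w


w_final = gradient_descent(w0=0.0)
print(w_final, loss(w_final))
```

With a learning rate that is too large the updates overshoot the minimum, and with one that is too small the descent takes many more steps.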
There are several variants of gradient descent. They include:
What is TensorFlow?
on Google Colab. You will, therefore, not install it when working in this
environment.
You can also install TensorFlow using Docker. Docker is the easiest way to
install TensorFlow on Linux if GPU support is
desired.
able image
docker run -it -p 8888:8888 tensorflow/tensorflow:latest-jupyt
GPUs on Mac.
Why TensorFlow?
TensorFlow:
common problems.
Well documented.
faster.
learning networks.
TensorFlow basics
Tensors
to NumPy arrays. Tensors are immutable, meaning they cannot be updated
once created. Tensors can contain integers,
import tensorflow as tf
import numpy as np
print(x)
print(x.shape)
print(x.dtype)
# tf.Tensor(
# [[ 7. 8. 9.]
# (2, 3)
# <dtype: 'float32'>
NumPy arrays.
x[0]
9.],
# dtype=float32)>
x[1:4]
x**2
np.array(x)
x.numpy()
tensors.
try:
tensor = tf.constant(ragged_list)
except Exception as e:
print(f"{type(e).__name__}: {e}")
# ValueError: Can't convert non-rectangular Python sequence to Tensor.
ragged_tensor = tf.ragged.constant(ragged_list)
print(ragged_tensor)
print(ragged_tensor.shape)
# (4, None)
2]],
values=[1, 2],
dense_shape=[3, 4])
print(sparse_tensor, "\n")
# SparseTensor(indices=tf.Tensor(
# [[0 0]
(2,), dtype=int64))
print(tf.sparse.to_dense(sparse_tensor))
# tf.Tensor(
#[[1 0 0 0]
# [0 0 2 0]
Variables
A TensorFlow variable is used to represent state in
my_variable = tf.Variable(my_tensor)
my_variable
# array([[8., 8.],
print("As NumPy: ", my_variable.numpy())
# Shape: (2, 2)
# [6. 5.]]
Automatic differentiation
x = tf.Variable(47.0)
y = x**2
# dy = 2x * dx
dy_dx = tape.gradient(y, x)
dy_dx.numpy()
# 94.0
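The value 94.0 can be cross-checked without TensorFlow. The sketch below approximates the same derivative numerically with a central difference; the helper name `central_difference` and the step size are assumptions made for this example, and the approach is a sanity check rather than how `tf.GradientTape` computes gradients.

```python
def f(x: float) -> float:
    """The same function differentiated above: y = x**2."""
    return x ** 2


def central_difference(func, x: float, h: float = 1e-3) -> float:
    """Approximate the derivative of func at x from two nearby evaluations."""
    return (func(x + h) - func(x - h)) / (2 * h)


approx = central_difference(f, 47.0)
print(approx)
```

The numerical estimate agrees with the analytical derivative 2x evaluated at x = 47, up to floating-point rounding.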
The GradientTape
e...
with tf.GradientTape() as t:
Tape
W and b
model.w.assign_sub(learning_rate * dw)
model.b.assign_sub(learning_rate * db)
save and restore them without the original Python code. As a result, these
graphs can be used in non-Python environments such as mobile devices,
servers, edge devices, and
TensorFlow graph representing a two-layer neural network
def simple_relu(x):
    if tf.greater(x, 0):
        return x
    else:
        return 0
e_relu.
tf_simple_relu = tf.function(simple_relu)
(1)).numpy())
(-1)).numpy())
inherits tf.Module .
class SimpleModule(tf.Module):
super().__init__(name=name)
lse, name="do_not_train_me")
simple_module = SimpleModule(name="simple")
simple_module(tf.constant(5.0))
How to train artificial neural networks
with Keras
import pandas as pd
df = pd.read_csv("train.csv")
df.head()
Data pre-processing
dealing with null values, but in this case, we'll replace them with
s'].mean()
labelencoder = LabelEncoder()
df = df.assign(satisfaction = labelencoder.fit_transform(df["satisfaction"]))
Data transformation
Apart from the target column, other columns are also in text
categories = df.select_dtypes(include=['object']).columns.tolist()
categories
to scale can lead to NaNs in the training loss due to the large
hot encoding.
Scale numerical columns using MinMaxScaler to ensure that all values are
between 0 and 1.
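The rescaling that MinMaxScaler applies to each numerical column can be sketched in a few lines of plain Python. The function name `min_max_scale` and the sample `ages` column are hypothetical and only serve to show the arithmetic.

```python
def min_max_scale(values):
    """Rescale a list of numbers so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [18, 30, 45, 60]  # hypothetical numerical column
scaled = min_max_scale(ages)
print(scaled)
```

In practice the minimum and maximum are learned from the training data only and then reused for the test data, which is what fitting the scaler before transforming achieves.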
random_state = 13
test_size = 0.3
nder=MinMaxScaler())
X = transformer.fit_transform(X)
size=test_size,random_state=random_state)
network
the Sequential API. We can stack the layers we want in our network using
this API. In this case, let's define a network with
the following layers:
layers, but we'll look at how to select the best units later.
problem.
model = Sequential(
Input(shape=(X_train.shape[1],)),
t_uniform",name="layer1"),
_uniform", name="layer2"),
Apart from the number of units, the dense layer has other
parameters:
Activation function, usually ReLU.
The kernel initializer that determines how the weights will be initialized.
metrics_df = pd.DataFrame(history.history)
metrics_df[["loss","val_loss"]].plot();
metrics_df[["accuracy","val_accuracy"]].plot();
You can tell the network is learning if the training and validation
overfitting
during the training process. This forces the network to learn the
model = Sequential([
    Input(shape=(X_train.shape[1],)),
    Dense(64, activation="relu", kernel_initializer="glorot_uniform", name="layer1"),
    Dropout(rate=0.1),
_uniform", name="layer2"),
])
batch normalization
model = Sequential(
Input(shape=(X_train.shape[1],)),
t_uniform",name="layer1"),
BatchNormalization(),
_uniform", name="layer2"),
the loss and halt training when the loss is no longer improving for the
number of epochs specified. In this case, we stop
epochs.
callbacks = [tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3)]
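The patience rule can be illustrated without Keras. The sketch below is a simplified model of the idea, not the library's implementation: it ignores options such as a minimum improvement threshold, and the function name `epochs_until_stop` and the loss values are made up for the example.

```python
def epochs_until_stop(losses, patience=3):
    """Return the 1-based epoch at which training would halt.

    Training stops once the loss has failed to improve on its best value
    for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, current in enumerate(losses, start=1):
        if current < best:
            best = current
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(losses)


# Hypothetical per-epoch losses: the best value is reached at epoch 3.
history = [0.9, 0.7, 0.6, 0.61, 0.6, 0.62, 0.5]
print(epochs_until_stop(history, patience=3))
```

Note that the last value in the list would have been an improvement, which shows the trade-off: a small patience saves time but can stop before a late improvement arrives.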
checkpoints
Apart from halting the training, you may want to save the best
only.
validation accuracy.
checkpoint_filepath = "model_checkpoint"
callbacks = [
    tf.keras.callbacks.EarlyStopping(monitor="loss", patience=3),
    tf.keras.callbacks.ModelCheckpoint(
        filepath=checkpoint_filepath,
        save_weights_only=True,
        monitor="val_accuracy",
        mode="max",
        save_best_only=True)]
Checkpoint files
model.load_weights(checkpoint_filepath)
Make predictions on the test set
Let's now use this model to make predictions on the test set.
y_pred = model.predict(X_test)
matrix.
cm = confusion_matrix(y_test, y_pred)
cm
# array([[17396, 387],
# [ 856, 12533]])
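A network with a sigmoid output returns probabilities, while a confusion matrix expects class labels, so the predictions have to be thresholded first. The sketch below shows that step and the counting behind a 2x2 confusion matrix in plain Python. The 0.5 threshold, the helper names, and the sample values are assumptions for illustration.

```python
def to_labels(probabilities, threshold=0.5):
    """Convert predicted probabilities into 0/1 class labels."""
    return [1 if p > threshold else 0 for p in probabilities]


def confusion_matrix_2x2(y_true, y_pred):
    """Count outcomes with rows as true classes and columns as predicted classes."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


y_true = [0, 0, 1, 1, 1]            # hypothetical true labels
y_prob = [0.1, 0.7, 0.8, 0.4, 0.9]  # hypothetical model outputs
y_hat = to_labels(y_prob)
print(confusion_matrix_2x2(y_true, y_hat))
```

The diagonal holds the correct predictions, and the off-diagonal cells hold the false positives (top right) and false negatives (bottom left).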
import numpy as np
model.save_weights('checkpoint')
for
save the entire model, use model.save and pass the folder
model.save_weights('./checkpoints/my_checkpoint')
model.save("saved_model")
following items:
You can then load the model and check its architecture.
new_model = tf.keras.models.load_model('saved_model')
new_model.summary()
rmat
model.save('my_model.h5')
new_model = tf.keras.models.load_model('my_model.h5')
cross-validation
Next, let's look at how we can evaluate the Keras model
# https://www.adriangb.com/scikeras/stable/
def make_model():
model = Sequential([
Input(shape=(X_train.shape[1],)),
t_uniform",name="layer1"),
_uniform", name="layer2"),
return model
zer="adam", metrics=["accuracy"],loss="binary_crossentropy",validation_split=0.2, epochs=1)
The metrics.
The loss.
The optimizer.
mean = accuracies.mean()
mean
# 0.9180552089621082
variance = accuracies.var()
variance
# 8.615227293449488e-05
Keras
params = {
"batch_size":[10,20,32,64],
"epochs":[2,3,4],
"optimizer":["adam","rmsprop"]
The parameters.
grid_search = GridSearchCV(estimator=model,
param_grid=params,
scoring="accuracy",
cv=2)
The next step is to fit the grid search to the training data.
grid_search = grid_search.fit(X_train,y_train)
best_param
best_accuracy = grid_search.best_score_
best_accuracy
# 0.9392839465434747
model = Sequential()
model.add(Input(shape=(X_train.shape[1],)))
model.add(Dense(hidden_layer_size, activation="relu"))
model.add(Dropout(dropout))
model.add(Dense(1, activation="sigmoid"))
return model
predictions.
params = {
}
gs = GridSearchCV(my_model, params, scoring='accuracy', n_jobs=-1, verbose=True)
gs.fit(X_train, y_train)
print(gs.best_score_, gs.best_params_)
Final thoughts
network.
layers.
backpropagation.
TensorFlow basics.
TensorFlow
What is CNN?
features from data. CNNs are primarily used in image tasks and in other
problems such as natural language processing tasks.
Convolution
detector over the input image. The feature detector also goes
Slide the kernel through the entire input image to obtain all
The kernel moves over the input images through steps known
the network.
A 3-by-3 convolution operation.
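Here, the feature map has the same size as the kernel. In general, with a
stride of 1 and no padding, an n-by-n input and a k-by-k kernel give an
(n - k + 1)-by-(n - k + 1) feature map.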
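The sliding-window computation can be written out directly. The sketch below is a plain-Python illustration with a stride of 1 and no padding; the function name `conv2d_valid`, the 5-by-5 input, and the 3-by-3 kernel are assumptions chosen so the result is easy to verify by hand. Like the convolution layers in deep learning libraries, it does not flip the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    feature_map = []
    for r in range(n_rows - k_rows + 1):
        row = []
        for c in range(n_cols - k_cols + 1):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 5x5 input with ones on the diagonal, and a 3x3 diagonal detector.
image = [[1 if r == c else 0 for c in range(5)] for r in range(5)]
kernel = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The 5-by-5 input and 3-by-3 kernel yield a 3-by-3 feature map, and the largest responses appear where the input pattern lines up with the kernel.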
Padding
Applying the kernel without padding shrinks the output: here it is reduced to the size of the kernel.
However, keeping the same image size after applying the
image by adding zeros around the image such that when the
kernel is applied, the output has the same size as the input
Same to pad such that the size of the input image and the
Apply ReLU
values below zero to zero while the others are returned as the
actual values.
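As a quick illustration of that rule, here is a minimal ReLU in plain Python. The function name and sample values are chosen for this example only.

```python
def relu(values):
    """Replace negative entries with zero and keep the rest unchanged."""
    return [max(0.0, v) for v in values]


print(relu([-2.0, 0.0, 3.5]))
```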
Pooling
At this point, we have a feature map. It is desirable to reduce
the size of the feature map further. This is done via a process
filter is applied to reduce the size of the feature map. This filter
Max pooling where the filter slides over the feature map
a given box.
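Max pooling can be sketched in plain Python as follows. The example assumes a square pooling window that moves by its own size (a 2-by-2 window with a stride of 2), which is a common default; the function name and the sample feature map are hypothetical.

```python
def max_pool2d(feature_map, pool_size=2):
    """Keep the largest value in each non-overlapping pool_size x pool_size box."""
    pooled = []
    for r in range(0, len(feature_map) - pool_size + 1, pool_size):
        row = []
        for c in range(0, len(feature_map[0]) - pool_size + 1, pool_size):
            window = [feature_map[r + i][c + j]
                      for i in range(pool_size)
                      for j in range(pool_size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 8],
    [1, 1, 7, 5],
]
pooled = max_pool2d(feature_map)
print(pooled)
```

The 4-by-4 feature map shrinks to 2-by-2 while the strongest activation in each box is preserved.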
Dropout regularization
Flattening
It's time to pass the pooled feature map to a fully connected
Full connection
Activation function
(CNN) in TensorFlow
With the basics out of the way, let's build CNNs with
installed.
How to install TensorFlow
You can also install TensorFlow using Docker. Docker is the easiest way to
install TensorFlow on Linux if GPU support is
desired.
docker pull tensorflow/tensorflow:latest # Download latest stable image
GPUs on Mac.
installed
printed.
import tensorflow as tf
print(tf.__version__)
Let's use the Fashion MNIST dataset to illustrate how to build multilayer
CNN models with TensorFlow. The dataset contains
https://github.com/zalandoresearch/fashion-mnist
Data preprocessing
First, load the dataset. We use Layer to achieve this.
import layer
mnist_train = layer.get_dataset('layer/fashion_mnist/datasets/fashion_mnist_train').to_pandas()
mnist_test = layer.get_dataset('layer/fashion_mnist/datasets/fashion_mnist_test').to_pandas()
mnist_train["images"][17]
mnist_test["images"][23]
import numpy as np
def images_to_np_array(image_column):
return np.array([np.array(im.getdata()).reshape((im.size
train_images = images_to_np_array(mnist_train.images)
test_images = images_to_np_array(mnist_test.images)
train_labels = mnist_train.labels
test_labels = mnist_test.labels
Model definition
Now that the dataset is ready, define the CNN network. The
model = keras.Sequential(
    [
        keras.Input(shape=(parameters["shape"], parameters["shape"], 1)),
        layers.Conv2D(32, kernel_size=(parameters["kernel_size"], parameters["kernel_size"]), activation=parameters["activation"]),
        layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"])),
        layers.Conv2D(64, kernel_size=(parameters["kernel_size"], parameters["kernel_size"]), activation=parameters["activation"]),
        layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"])),
        layers.Flatten(),
        layers.Dropout(parameters["dropout"]),
        layers.Dense(parameters["classes"], activation="softmax"),
    ]
)
use sparse categorical cross-entropy because the labels are integers. The
categorical cross-entropy is used when the labels
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
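The difference between the two losses lies only in how the label is encoded. The sketch below computes both for a single example in plain Python and shows that they give the same value; the function names mirror the Keras loss names for readability but are standalone illustrations, and the probabilities are made up.

```python
import math


def sparse_categorical_crossentropy(label, probs):
    """Loss for an integer label: negative log of the probability of that class."""
    return -math.log(probs[label])


def categorical_crossentropy(one_hot, probs):
    """Loss for a one-hot label: same quantity, selected by the 1 in the vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)


probs = [0.1, 0.7, 0.2]  # hypothetical softmax output for three classes
sparse_loss = sparse_categorical_crossentropy(1, probs)
dense_loss = categorical_crossentropy([0, 1, 0], probs)
print(sparse_loss, dense_loss)
```

Integer labels avoid building a one-hot vector per example, which saves memory when there are many classes.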
the fit method. Apart from the training and validation data,
epochs.
ata=(test_images,test_labels), epochs=parameters["epochs"])
validation metrics.
metrics_df = pd.DataFrame(history.history)
metrics_df[["loss","val_loss"]].plot();
Model evaluation
verbose=2)
predictions = model.predict(test_images)
df = pd.DataFrame(predictions, columns=
CNN models can take a long time to train, especially when the
callbacks = [tf.keras.callbacks.EarlyStopping(monitor='accurac
model.compile(optimizer=parameters["optimizer"],
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
ata=(test_images,test_labels), epochs=parameters["epochs"],callbacks=callbacks)
batch normalization
model = keras.Sequential(
    [
        keras.Input(shape=(parameters["shape"], parameters["shape"], 1)),
        layers.Conv2D(32, kernel_size=(parameters["kernel_size"], parameters["kernel_size"]), activation=parameters["activation"]),
        layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"])),
        layers.Conv2D(64, kernel_size=(parameters["kernel_size"], parameters["kernel_size"]), activation=parameters["activation"]),
        layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"])),
        layers.Flatten(),
        layers.Dropout(parameters["dropout"]),
        layers.Dense(64, activation="relu"),
        layers.BatchNormalization(),
        layers.Dense(parameters["classes"], activation="softmax"),
    ]
)
model.compile(optimizer=parameters["optimizer"],
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
ata=(test_images,test_labels), epochs=parameters["epochs"])
every epoch.
class CustomCallback(Callback):
keys = list(logs.keys())
(epoch, keys))
model.compile(optimizer=parameters["optimizer"],
              # The final layer already applies softmax, so the loss receives
              # probabilities and from_logits stays at its default of False.
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
_images,test_labels),epochs=parameters["epochs"], callbacks=[CustomCallback()])
learning model
model.summary()
Output shape for each network.
tf.keras.utils.plot_model(
    model,
    to_file="model.png",
    show_shapes=True,
    show_layer_names=True,
    rankdir="TB",
    expand_nested=True,
    dpi=96,
)
You can follow this plot from the top to see how the shapes
model
model.save_weights('./checkpoints/my_checkpoint')
model.save("saved_model")
new_model = tf.keras.models.load_model('saved_model')
new_model.summary()
You can then load the model and use it for predictions or re-
train it.
Running CNNs with TensorFlow
To run CNNs in the real world, we need the ability to load and
process image data from a folder. In this part of the article, we'll
Loading the images
import tarfile
import wget  # third-party package: pip install wget

wget.download("http://data.vision.ee.ethz.ch/cvl/food-101.tar.gz")
food_tar = tarfile.open('food-101.tar.gz')
food_tar.extractall('.')
food_tar.close()
Generate a tf.data.Dataset
arguments:
transformations.
base_dir = 'food-101/images'
batch_size = 32
img_height = 128
img_width = 128
import tensorflow as tf
training_set = tf.keras.utils.image_dataset_from_directory(
    base_dir,
    label_mode="int",
    validation_split=0.2,
    subset="training",
    seed=100,
    image_size=(img_height, img_width),
    batch_size=batch_size)
TensorFlow will infer the labels of the images from the directory
structure.
validation_set = tf.keras.utils.image_dataset_from_directory(
base_dir,
validation_split=0.2,
subset="validation",
seed=100,
image_size=(img_height, img_width),
batch_size=batch_size)
class_names = training_set.class_names
print(class_names)
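import matplotlib.pyplot as plt

plt.figure(figsize=(10, 10))
# Take one batch from the training set and show its first nine images.
for images, labels in training_set.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")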
Buffered dataset prefetching
be greater than the batch size. You can set this manually or
AUTOTUNE = tf.data.AUTOTUNE
training_ds = training_set.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
validation_ds = validation_set.cache().prefetch(buffer_size=AUTOTUNE)
Image augmentation
Random rotations.
Random zoom.
data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal",
                          input_shape=(img_height,
                                       img_width,
                                       3)),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
    ]
)
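plt.figure(figsize=(10, 10))
# Apply the augmentation pipeline nine times to one batch of images.
for images, _ in training_set.take(1):
    for i in range(9):
        augmented_images = data_augmentation(images)
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        plt.axis("off")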
Model definition
model = keras.Sequential([
    data_augmentation,
    layers.Rescaling(1./255),
    layers.Conv2D(filters=32, kernel_size=(3,3), activation='relu'),
    layers.MaxPooling2D(pool_size=(2,2)),
    layers.Conv2D(filters=32, kernel_size=(3,3), activation='relu'),
    layers.MaxPooling2D(pool_size=(2,2)),
    layers.Dropout(0.25),
    layers.Conv2D(filters=64, kernel_size=(3,3), activation='relu'),
    layers.MaxPooling2D(pool_size=(2,2)),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.25),
    layers.Dense(len(class_names), activation='softmax')])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
ience=3)
epochs=100
history = model.fit(training_set,validation_data=validation_set, epochs=epochs,callbacks=[callback])
Model evaluation
performance
import pandas as pd
metrics_df = pd.DataFrame(history.history)
metrics_df[["loss","val_loss"]].plot();
metrics_df[["accuracy","val_accuracy"]].plot();
TensorBoard
parameters:
network.
log_folder ="logs"
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_folder,
                                                      histogram_freq=1,
                                                      write_graph=True,
                                                      write_images=True,
                                                      update_freq='epoch')
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
model.fit(training_set,validation_data=validation_set,epochs=2, callbacks=[tensorboard_callback])
every batch.
profile analysis.
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=l
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
# model.fit(training_set,validation_data=validation_set,epochs=2, callbacks=[tensorboard_callback])
0.2, epochs=1,callbacks=[tensorboard_callback])
%tensorboard --logdir={log_folder}
performance.
The TensorFlow Stats tool shows the performance of all
The Trace Viewer under the tools section shows performance
Making predictions
image_url = "https://upload.wikimedia.org/wikipedia/commons/b/b1/Buttermilk_Beignets_%284515741642%29.jpg"
e_url)
test_image = tf.keras.utils.load_img(
img_array = tf.keras.utils.img_to_array(test_image)
img_array = tf.expand_dims(img_array, 0)
Source
prediction = model.predict(img_array)
prediction
function sums to 1.
import tensorflow as tf
import numpy as np
scores = tf.nn.softmax(prediction[0])
scores = scores.numpy()
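What the softmax call does to a vector of scores can be sketched in plain Python. The function below is an illustration rather than TensorFlow's implementation; subtracting the maximum before exponentiating is a standard trick to avoid overflow, and the sample scores are made up.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive values that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


raw_scores = [2.0, 1.0, 0.1]  # hypothetical class scores
probabilities = softmax(raw_scores)
print(probabilities, sum(probabilities))
```

The largest score keeps the largest probability, so the predicted class is the index of the maximum either before or after the softmax.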
CNN architectures
be specific to a task.
Xception
ResNet50
InceptionV3
MobileNetV2
DenseNet121
NASNetLarge
EfficientNetB1
model = tf.keras.applications.ResNet152(
    include_top=True,
    weights="imagenet",
    input_tensor=None,
    input_shape=None,
    pooling=None,
    classes=1000,
    classifier_activation="softmax",
)
the image size used to train this ResNet network. We also need
to process the image the same way the training images were
ut, decode_predictions
test_image = tf.keras.utils.load_img(
img_array = tf.keras.utils.img_to_array(test_image)
img_array = tf.expand_dims(img_array, 0)
x = preprocess_input(img_array)
preds = model.predict(x)
1112)]
on custom data, you set this to false and then include another
You may also want to load the CNN architecture without the
weights. Doing this means that you will start the training from
scratch. In most cases, you'll want to load the networks with the
done.
model = tf.keras.applications.ResNet152(
    include_top=True,
    weights="imagenet",
    input_tensor=None,
    input_shape=None,
    pooling=None,
    classes=1000,
    classifier_activation="softmax",
)
Final thoughts
epochs.
TensorBoard.
Networks
points. For example, the average sales made per month over a
to month, meaning that the sales for the first month are the only
independent sales. The rest are dependent on the sales made
prior.
Network?
networks.
node to the hidden layers until the output layer. These types of
making predictions.
Traditional Feed-Forward Network
input, which is the previous input from the earlier layers. Thus,
apply their weights to both the current and the previous input.
They also tweak the weights for the gradient descent during
uncover.
backpropagation.
previous layer.
to the network.
output.
Networks
captioning RNN.
Weaknesses of RNNs
Let's now talk about some of the challenges you will encounter
backpropagation.
Meaning that the set of input values from the start would
prediction.
in a particular text:
If the text was, "Please go and get me a very big glass of water
now!".
say in our case that only the text that comes after 'very' would be useful in
the text classifier neural network.
2. Exploding gradient problem
In exploding gradients, the gradients accumulate and become
so big that the updates made to the neural network weights are
very large during training. This occurs when the gradients of the
way, the result is an unstable neural network that does not give
accurate outputs.
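A simplified way to see why gradients explode, or vanish, over long sequences is to multiply the same factor once per time step. The sketch below does exactly that; treating the backpropagated gradient as a single repeated factor is an assumption made for illustration, since a real RNN multiplies by matrices that change at each step.

```python
def backpropagated_scale(factor: float, time_steps: int) -> float:
    """Scale applied to a gradient after passing through time_steps identical steps."""
    scale = 1.0
    for _ in range(time_steps):
        scale *= factor
    return scale


exploding = backpropagated_scale(1.5, 50)  # factor above 1 grows without bound
vanishing = backpropagated_scale(0.5, 50)  # factor below 1 shrinks toward zero
print(exploding, vanishing)
```

After only 50 steps the two cases differ by more than twenty orders of magnitude, which is why long sequences make plain RNNs hard to train.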
neural network:
2. If the changes made to the loss are very large after each
update.
(LSTM)
Opensource LSTM image by Wikimedia
cells.
3. Reset - This is the forget gate. It gets rid of information within a cell that
is no longer necessary.
memory units are what account for the long-term recall ability of
Applications of LSTM
couple:
4. Speech synthesis.
Bidirectional LSTM
connected to the same output layer. The implication here is that the nodes in
the Bidirectional RNN have sequential information
1. Text classification.
2. Speech classification.
3. Forecasting models.
in TensorFlow
Imports
functionality.
import pandas as pd
import numpy as np
import tensorflow as tf
import keras
Pandas is used to load the dataset as a DataFrame along with other
pre-processing steps.
the data will move from the first layer, the input layer, to the
network.
model.
Data pre-processing
loans = pd.read_csv("loans.csv")
loans = loans[['created_at','amount']]
loans['created_at'] = pd.DatetimeIndex(loans['created_at'])
loans = loans.groupby(['created_at']).amount.sum().reset_index()
loans.sort_values(by=['created_at'], inplace=True)
loans = loans.set_index('created_at')
The steps involved here are as follows:
scaler = MinMaxScaler(feature_range=(0,1))
scaled_loans = scaler.fit_transform(loans)
Next, prepare the data for loading to the LSTM Model. We feed
the neural network with enough data that it can predict the next
obtain the loans given for sixty days and use the value to
x_train = []
y_train = []
x_train.append(scaled_loans[i-60:i, 0])
y_train.append(scaled_loans[i, 0])
y_train = np.array(y_train)
x_train = np.array(x_train)
_train)[1], 1 ))
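The windowing step can be illustrated on a toy series in plain Python. In this sketch each sample holds the previous 60 values and the target is the value that follows; the function name `make_windows` and the synthetic series are assumptions for the example, not the loans data.

```python
def make_windows(series, window=60):
    """Split a series into (previous `window` values, next value) pairs."""
    x, y = [], []
    for i in range(window, len(series)):
        x.append(series[i - window:i])  # the `window` values before position i
        y.append(series[i])             # the value the model should predict
    return x, y


series = [float(v) for v in range(100)]  # hypothetical scaled daily totals
x_windows, y_targets = make_windows(series, window=60)
print(len(x_windows), len(x_windows[0]), y_targets[0])
```

A series of 100 values yields 40 samples of length 60, which would then be reshaped to (samples, time steps, features) before being passed to an LSTM layer.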
regressor = Sequential()
=(np.shape(x_train)[1],1)))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50))
regressor.add(Dropout(0.2))
regressor.add(Dense(units=1))
regressor.add(Dropout(0.2))
regressor.add(Dense(units=1))
regressor.summary()
For each successive LSTM layer in the hidden layer, you will
find that it does not have the input shape defined since it takes
the input of the preceding layer. Also, other than the last LSTM
network, including:
regressor.add(Bidirectional(LSTM(units=50, return_sequences=True)))
The first line sets the initial value for the learning rate. This
optimizer.
epochs .
performance.
h:1e-7 * 10**(epoch/20))
opt = tf.keras.optimizers.Adam(learning_rate=1e-7)
pe'])
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='loss', mode='min',patience=20)
mc = tf.keras.callbacks.ModelCheckpoint('best_model.h5', monito
The last step here is to fit the x_train and y_train into the
regressor model.
the following:
prediction.
if(steps == 0):
print(prediction)
return prediction
else:
pred = np.append(pred,prediction)
steps = steps-1
print(prediction)
predictSteps(pred, steps)
predictSteps(x_copy[-1], 10)
LSTM model evaluation
To get the metrics used in the model, access the keys from the
history:
print(hist.history.keys())
using Matplotlib.
plt.plot(hist.history['loss'])
the model.
n = keras.metrics.MeanAbsoluteError()
m.update_state(actual, prediction)
n.update_state(actual, prediction)
err = m.result().numpy()
err_1 = n.result().numpy()
the MeanPercentageError .
print(train_errors)
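The two error measures can be written out by hand to show what the metric objects accumulate. The sketch below assumes the percentage metric is the mean absolute percentage error and that no actual value is zero; the function names and sample numbers are illustrative only.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the errors, in the units of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average size of the errors relative to the actual values, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0, 400.0]     # hypothetical observed values
predicted = [110.0, 180.0, 400.0]  # hypothetical model outputs
mae = mean_absolute_error(actual, predicted)
mape = mean_absolute_percentage_error(actual, predicted)
print(mae, mape)
```

The absolute error is easier to interpret in the original units, while the percentage error allows comparison across series with different scales.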
Intent classification with LSTM
the posts and comments of their user base, they may need to
department.
than meaning.
account rather than just the word itself. This enables the word's
Imports
in the sentences.
import pandas as pd
import re
import tensorflow as tf
import keras
mbedding, SpatialDropout1D
import nltk
nltk.download("stopwords")
Load dataset
complaints = pd.read_csv("complaints.csv")
t']]
complaints.dropna(inplace=True)
Data cleaning
digits and other symbols combined with text data and hence
symbols_regex = re.compile(r'[/(){}\[\]\|@,;]')
text = re.sub(r'\d+', '', text)  # remove digits; str.replace would treat '\d+' as literal text
text = text.lower()
return text
Label exploration
complaints['Product'].value_counts().sort_values(ascending=False)
Text vectorization
vectorize_layer = tf.keras.layers.TextVectorization(standardize='lower_and_strip_punctuation',max_tokens=5000,output_mode='int',output_sequence_length=512)
vectorize_layer.adapt(complaints_text,batch_size=None)
X_train_padded = vectorize_layer(complaints_text)
X_train_padded = X_train_padded.numpy()
Since the neural network can only have numbers as its input,
numbers.
le = sklearn.preprocessing.LabelEncoder()
complaints['Product'] = le.fit_transform(complaints['Product'])
y = complaints['Product']
classifier = Sequential()
classifier.add(SpatialDropout1D(0.2))
classifier.add(Dense(17, activation='softmax'))
The final layer of the model has 17 cells – for the 17 different
classifier.evaluate(X_test,y_test)
Final thoughts
recall ability.
2019, is estimated to have cost around $1.6m to train. Such a cost would
make it difficult for individuals or small organizations
Luckily, you do not have to train models such as GPT-2. You can obtain a
copy of such models free of charge from the
language processing.
What is transfer learning?
save time.
set of data.
This approach not only reduces training time but also lowers
learning idea.
Transfer learning
example:
effectively.
are the same. However, the source and target problems vary.
scratch.
domains are similar, but the problems differ. In this case, the
knowledge transfer.
data with the same feature space as the target domain. In such
such constraints.
What is the difference between
tuning?
scratch.
work?
cases, you might end up with modest results. Such cases occur
articles dataset. To solve this, do domain adaptation first and then train the
models on the target task to improve performance.
learning?
After determining the suitable model for your problem, the next
step involves acquiring the model. More about this later in the
article.
(RNNs).
units in the final output layer than you need. You will have to
drop the final output layer and incorporate a final output layer
Step 3: Freeze layers so they don’t change
during training
can freeze the initial (let's say k) layers of the pre-trained model and train a
new model with the remaining (n-k) layers.
from scratch.
be non-trainable.
base_model.trainable = False
Or using:
base_model.trainable = 0
We are only using the feature extraction layers from the base
frozen layers to learn new weights and features for the new
model. For example, the final dense layer should represent the
Setting a low learning rate for the model will prevent overfitting
have a very low learning rate because you are training the
to overfitting.
When using the base model with frozen layers, the model's
again whenever any changes are made to the model for the
models?
Keras applications
TensorFlow Hub
PyTorch Hub
i.e., 97.5%, while MobileNet has the least top accuracy, i.e., 89.5%.
You can select any model for your problem. Once you have
import tensorflow as tf
pretrained_model = tf.keras.applications.MobileNetV3Small(
    alpha=1.0,
    include_top=True,
    weights="imagenet",
    input_tensor=None,
    pooling=None,
    classes=1000,
    classifier_activation="softmax"
)
pretrained_model.trainable = False
pretrained_model.summary()
TensorFlow Hub
import tensorflow as tf
import tensorflow_hub as hub
mobilenet_v2 ="https://tfhub.dev/google/imagenet/mobilenet_v3_small_100_224/classification/5"
classifier_model = mobilenet_v2
IMAGE_SHAPE = 224
classifier = tf.keras.Sequential([
hub.KerasLayer(classifier_model,
])
classifier.summary()
In NLP applications, the goal is to generate a representation of
Google’s Word2vec
Fasttext
!wget http://nlp.stanford.edu/data/glove.6B.zip
!unzip glove*.zip
es
import numpy as np
x = {'the', 'match', 'score', 'prime',
tokenizer = Tokenizer()
tokenizer.fit_on_texts(x)
embedding_dimension):
vocabulary_size = len(word_index) + 1
VocabEmbeddingMatrix = np.zeros((vocabulary_size,
embedding_dimension))
for line in f:
if word in word_index:
idx = word_index[word]
VocabEmbeddingMatrix[idx] = np.array(
vector, dtype=np.float32)[:embedding_dimension]
return VocabEmbeddingMatrix
embedding_dimension = 50
VocabEmbeddingMatrix = embedding_for_vocab(
'glove.6B.50d.txt', tokenizer.word_index,
embedding_dimension)
VocabEmbeddingMatrix[1])
Google’s Word2vec
!wget http://vectors.nlpl.eu/repository/20/51.zip
#unzip
!unzip 51.zip
!gzip model.bin
import gensim
from gensim.models import KeyedVectors
EMBEDDING_FILE = 'model.bin.gz'
word_vectors = KeyedVectors.load_word2vec_format(EMBEDDING_FILE, binary=True)
e'], negative=['man'])
atch
print(f"{most_similar_key}: {similarity:.4f}")
Fasttext
import re
m, source_str))
return tokens
counter = nlp.data.count_tokens(tokenizer(sentence))
#create vocabulary
vocab = nlp.Vocab(counter)
#attach embedding
vocab.set_embedding(fasttext_model)
vocab.embedding['player'][:5]
Hugging Face
Face for NLP tasks. You can use Hugging Face for problems
related to:
Question answering
Summarization
Translation
Text generation
Install transformers:
fication
tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")
ner_results = nlp(sentence)
print(ner_results)
import torchvision
model_conv = torchvision.models.resnet18(pretrained=True)
models?
Prediction
path = 'satyabratasm-u_kMWN-BWyU-unsplash.jpg'
img = img_to_array(img)
[2]))
img = preprocess_input(img)
et
yhat = model.predict(img)
from keras.applications.vgg16 import decode_predictions
label = label[0][0]
Feature extraction
From the VGG16 model below, the input layer to the last max
import tensorflow as tf
import numpy as np
#pre-trained model
24))
image_data = img_to_array(image)
image_data = preprocess_input(image_data)
extracted_features = model.predict(image_data)
print (extracted_features.shape)
problem.
acquired (or part of it) and retraining the model on the new
model.trainable = True
#Apart from the 10 last layers, freeze all the other layers
layer.trainable = False
# compile and retrain with a very low learning rate
low_learning_rate = 1e-4
model.compile(loss = 'binary_crossentropy',
              optimizer = tf.keras.optimizers.RMSprop(learning_rate = low_learning_rate),
              metrics = ['acc'])
The best choice here depends on the new task. You might
model.
(optional).
data
enough dataset, the model can act as a generic model for other
similar sub-tasks.
relatively good results while reducing the time taken to train the
The data can be obtained from TensorFlow datasets or downloaded from the
dataset's repository. We will demonstrate
both approaches.
n-us/download/confirmation.aspx?id=54765
!wget --no-check-certificate \
"https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_5340.zip"
#remove previous files
irectory
dir = "PetImages/"
data = image_dataset_from_directory(dir,
                                    shuffle=True,
                                    batch_size=32,
                                    image_size=(150, 150))
"cats_vs_dogs",
rdinality(train_data))
print(
rdinality(validation_data)
ality(test_data))
ax = plt.subplot(2, 2, i + 1)
plt.imshow(img)
plt.title(int(label))
plt.axis("off")
plt.show()
Data pre-processing
2. Gray-scaling
3. Shifts
4. Flips
5. Brightness
6. Zoom
data_augmentation = keras.Sequential(
layers.RandomFlip(
mages
plt.figure(figsize=(10, 10))
first_image = images[7]
for i in range(9):
ax = plt.subplot(3, 3, i + 1)
augmented_image = data_augmentation(
plt.imshow(augmented_image[0].numpy().astype("int32"))
plt.axis("off")
plt.show();
Inception model
Inception model.
base_model = keras.applications.InceptionV3(
t.
base_model.trainable = False
images.
#rescale
1)
x = scale_layer(x)
x = base_model(x, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(1)(x)
model.summary()
The new dense layer has 1,281 parameters which we will be
accuracy.
First, compile the model to include the additional layer and train
rd
%load_ext tensorboard
log_folder = 'image_logs'
callbacks = [
    EarlyStopping(patience = 3),
    TensorBoard(log_dir=log_folder)
]
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=[keras.metrics.BinaryAccuracy()])
hist = model.fit(train_data,
                 epochs=5,
                 callbacks=callbacks)
%reload_ext tensorboard
#unfreeze the base model
base_model.trainable = True
#Apart from the 5 last layers, freeze all the other layers
model.summary()
learning_rate = 1e-5
model.compile(
ning rate
loss=keras.losses.BinaryCrossentropy(from_logits=True),
metrics=[keras.metrics.BinaryAccuracy()],
Next, train the model for a few epochs with a low learning rate.
%load_ext tensorboard
log_folder = 'fine_tune_logs'
callbacks = [
    EarlyStopping(patience = 5),
    TensorBoard(log_dir=log_folder)
]
epochs = 5
hist1 = model.fit(train_data,
                  epochs=epochs,
                  validation_data=validation_data,callbacks=callbacks)
To prevent overfitting, monitor the training loss using a callback
%reload_ext tensorboard
processing
1. Google’s Word2vec
2. Stanford’s GloVe
detecting sentiments.
!wget https://archive.ics.uci.edu/ml/machine-learning-databases/00462/drugsCom_raw.zip
!unzip drugsCom_raw.zip
import pandas as pd
import tensorflow as tf
es
import numpy as np
import re
df = pd.read_csv('drugsComTrain_raw.tsv', sep='\t')
df = df[['review', 'category']].copy()
df.head()
drug users. We will split the data and use 70% for training and
Data pre-processing
Data pre-processing facilitates extracting the most relevant
iew]
df.head()
import tensorflow as tf
vectorize_layer = tf.keras.layers.TextVectorization(
    standardize='lower_and_strip_punctuation',
    max_tokens=max_features,
    output_mode='int',
    output_sequence_length=max_len)
vectorize_layer.adapt(list((df['review'].values)),batch_size=None)
Below is a sample integer representation of the first sentence in
X_train[0]
X_t = list((df['review'].values))
est_size = 0.30)
X_train = vectorize_layer(X_train)
X_test = vectorize_layer(X_test)
# download glove
!wget http://nlp.stanford.edu/data/glove.6B.zip
# unzip it in Notebook
!unzip glove*.zip
embeddings.
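embeddings_index = {}
emb = open('glove.6B.100d.txt')
for sentence in emb:
    values = sentence.split()
    word = values[0]
    # the remaining fields on the line are the embedding coefficients
    coefs = np.asarray(values[1:], dtype='float32')
    embeddings_index[word] = coefs
emb.close()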
Create the embedding layer
vocabulary.
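#get vocabulary
voc = vectorize_layer.get_vocabulary()
# map each token to its integer index
word_index = dict(zip(voc, range(len(voc))))
num_tokens = len(voc) + 2
embedding_dim = 100
hits = 0
misses = 0
# rows for words missing from the GloVe index stay all-zero
embedding_matrix = np.zeros((num_tokens, embedding_dim))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector
        hits += 1
    else:
        misses += 1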
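To make the hit and miss counting concrete, here is a minimal sketch on toy data. The five-entry vocabulary, the two-dimensional vectors, and the plain Python lists standing in for NumPy arrays are all assumptions made for illustration; they are not the values used in this chapter.

```python
# Toy stand-ins (assumptions): a small vocabulary and a two-word embedding index.
voc = ["", "[UNK]", "pain", "relief", "zzzq"]
embeddings_index = {"pain": [0.1, 0.2], "relief": [0.3, 0.4]}
embedding_dim = 2

# One row per vocabulary entry; rows without a pre-trained vector stay all-zero.
embedding_matrix = [[0.0] * embedding_dim for _ in voc]
hits, misses = 0, 0
for i, word in enumerate(voc):
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector
        hits += 1
    else:
        misses += 1

print(f"Converted {hits} words ({misses} misses)")
```

A high miss count usually means the vocabulary contains many tokens that the pre-trained vectors never saw, for example misspellings or domain-specific terms.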
defined.
output_dim : Length of the vector for each word.
embedding_layer = Embedding(
    input_dim = num_tokens,
    output_dim = embedding_dim,
    embeddings_initializer=keras.initializers.Constant(embedding_matrix),
    trainable=False,
)
You can define your model using the resulting embedding layer.
and forward.
# define model
from tensorflow.keras.layers import Flatten
model = Sequential()
vocab_size = 10002
t_length=100, trainable=False)
model.add(e)
=0.1, recurrent_dropout=0.1)))
model.add(Flatten())
model.add(Dense(2, activation='sigmoid'))
rics=['accuracy'])
print(model.summary())
Training the model
Time to compile and train the model. You can use your
%load_ext tensorboard
log_folder = 'embed_logs'
rd
#apply callbacks
callbacks = [
    EarlyStopping(patience = 3),
    TensorBoard(log_dir=log_folder)
]
#compile
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
num_epochs = 10
60)
The model begins with an accuracy of 70.63% on the validation
below.
%reload_ext tensorboard
Final thoughts
With the advent of technologies like big data and deep learning,
there is a pressing need to adopt new methodologies that
classification task.
processing task.
TensorBoard
Let's dive in and see how to use TensorBoard with all these
packages.
Advantages of using
Tensorboard
PIP installation
prompt:
Conda installation
following commands:
Docker installation
If you use a Docker image of the Jupyter Notebook server,
w:nightly-py3-jupyter
Colab
through pip:
notebook:
jupyter notebook
%reload_ext tensorboard
Next, set the log directory where all the logs will be stored.
rm -rf ./logs/
#for windows
import shutil
try:
shutil.rmtree('logs')
except:
pass
#for windows
import shutil
try:
shutil.rmtree('logsx')
except:
pass
("%Y%m%d-%H%M%S")
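A timestamped log directory keeps each run separate, so TensorBoard can show runs side by side. The helper below is a minimal sketch using only the standard library; the function name `make_log_dir` and the `logs/fit` layout are assumptions for illustration, not names taken from this book.

```python
import datetime
import os

def make_log_dir(root="logs", prefix="fit"):
    """Return a per-run directory such as logs/fit/20240101-120000."""
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    return os.path.join(root, prefix, stamp)

logdir = make_log_dir()
print(logdir)
```

The resulting path can then be passed as the log directory of the TensorBoard callback.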
%load_ext tensorboard
Then define the model:
import datetime
import os
import numpy as np
import tensorflow as tf
import keras
from sklearn import datasets
from sklearn.preprocessing import normalize
from sklearn.model_selection import train_test_split
# Load data
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Normalize
X = normalize(X,axis = 0)
'''
70% -- train y
30% -- test y
'''
n, y_test
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
def create_model():
# Create the model
model = keras.models.Sequential()
model.add(Dense(3, activation='softmax'))
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
return model
create_model()
("%Y%m%d-%H%M%S"))
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
def train_model():
'''
'''
model = create_model()
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
model.fit(x=X_train,
y=y_train,
epochs=10,
rboard_callback)
ss)
tf.debugging.experimental.enable_dump_debug_info(
"/tmp/tfdbg2_logdir",
tensor_debug_mode="FULL_HEALTH",
circular_buffer_size=-1)
train_model()
variable log_folder that points to the logs folder that you had
created.
callback
including:
Weight histograms.
Sampled profiling.
TensorBoard.
Import TensorBoard.
("%Y%m%d-%H%M%S"))
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq = 1, write_graph = False,write_images = False)
model.
The next step involves compiling and fitting the model using the
Y%m%d-%H%M%S")
s")
file_writer.set_as_default()
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir, histogram_freq=1)
def train_model():
'''
'''
model = create_model()
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
model.fit(x=X_train,
y=y_train,
epochs=20,
orboard_callback])
# Get the accuracy of test data set
ss)
train_model()
Once the model is trained, the next step is to visualize it using
prompt.
or Google Colab:
If you have set everything right, you will see a window with
Running TensorBoard remotely
server:
If you are using PuTTY, you will need to replace ssh in the command with
PuTTY to create an SSH tunnel on port 6006 from
connected to with SSH. The tunnel you have created will stay
2. Transfer the port from the GPU server to the contact. Your
server:
components include:
TensorBoard scalars.
Images.
Graphs.
Distributions.
Histograms.
Fairness indicators.
TensorBoard scalars
learning rate.
Y%m%d-%H%M%S")
s")
file_writer.set_as_default()
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir, histogram_freq=1)
def train_model():
'''
'''
model = create_model()
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
model.fit(x=X_train,
y=y_train,
epochs=20,
ss)
train_model()
Load TensorBoard.
You can also include custom scalars. For instance, if you want
def lr_schedule(epoch):
"""
gress.
"""
learning_rate = 0.2
learning_rate = 0.02
learning_rate = 0.01
learning_rate = 0.005
=epoch)
return learning_rate
lr_callback = keras.callbacks.LearningRateScheduler(lr_schedule)
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)
'''
'''
model = create_model()
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
model.fit(x=X_train,
y=y_train,
epochs=epochs,
orboard_callback, lr_callback])
ss)
train_model(4)
Notice that you now have a new scalar output: the learning rate.
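A step schedule of the kind logged here can be sketched in plain Python. The four rates (0.2, 0.02, 0.01, 0.005) appear in the listing above; the epoch boundaries (10, 20, 50) are assumptions chosen for illustration, because the original conditions are not recoverable from the text.

```python
def lr_schedule(epoch, boundaries=(10, 20, 50), rates=(0.2, 0.02, 0.01, 0.005)):
    """Return the learning rate for a given epoch using a step schedule.

    `boundaries` are hypothetical epoch thresholds; `rates` has one more
    entry than `boundaries`, the last one applying after the final boundary.
    """
    for boundary, rate in zip(boundaries, rates):
        if epoch < boundary:
            return rate
    return rates[-1]

for epoch in (0, 10, 20, 50):
    print(epoch, lr_schedule(epoch))
```

A function with this signature (epoch in, rate out) is what `keras.callbacks.LearningRateScheduler` expects as its schedule argument.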
TensorBoard images
below.
#import libraries
import itertools
import datetime
import io
import tensorflow as tf
import numpy as np
import sklearn.metrics
import shutil
try:
shutil.rmtree('logsx')
except:
pass
fashion_mnist = keras.datasets.fashion_mnist
ion_mnist.load_data()
'Coat',
Y%m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(logdir)
with file_writer.as_default():
%reload_ext tensorboard
want to visualize.
with file_writer.as_default():
TensorBoard graphs
the GRAPHS tab in the upper pane. From the upper left corner,
select your preferred run. You can view the model and align it
with your desired design.
You will notice options like an op-level graph that gives you
node.
TensorBoard distributions
TensorBoard histograms
You can specify the histogram mode as either OVERLAY :
or OFFSET histogram:
various iterations.
Fairness indicators
Regardless of how much care has been taken during the model
models.
nstall tensorboard-plugin-fairness-indicators
You will need to restart the kernel for the plugin to be included
What-If Tool (WIT)
Counterfactual reasoning.
Explore how general changes to data points affect
predictions.
provide:
Next, click Accept. The tool will do the rest and return the
results.
Displaying data in TensorBoard
embedding projector
embedding layers. This guide will consider a simple example of vectors and
metadata. You will use the SummaryWriter to write
embedding.
%load_ext tensorboard
import numpy as np
import tensorflow as tf
import tensorboard as tb
tf.io.gfile = tb.compat.tensorflow_stub.io.gfile
#install pytorch
1]])
writer.add_embedding(vectors, metadata)
writer.close()
%tensorboard --logdir=runs
TensorBoard
Before fitting the training model, you can visualize training data
as shown below.
# Download the MNIST data. The data is already divided into train and test.
handwriting_mnist = keras.datasets.mnist
handwriting_mnist.load_data()
logdir = "logs/train_data/"
file_writer = tf.summary.create_file_writer(logdir)
import numpy as np
with file_writer.as_default():
Visualize images in
TensorBoard
dataset.
# Clear out prior logging data.
m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(logdir)
# class names
class_names = ['0', '1', '2', '3', '4', '5', '6', '7', '8',
'9']
def plot_to_image(figure):
buf = io.BytesIO()
plt.savefig(buf, format='png')
tly inside
# the notebook.
plt.close(figure)
buf.seek(0)
image = tf.expand_dims(image, 0)
return image
def image_grid():
"""
ure.
"""
figure = plt.figure(figsize=(10,10))
for i in range(25):
plt.subplot(5, 5, i + 1, title=class_names[train_labels
[i]])
plt.xticks([])
plt.yticks([])
plt.grid(False)
plt.imshow(train_images[i], cmap=plt.cm.binary)
return figure
figure = image_grid()
with file_writer.as_default():
TensorBoard
Using the TensorFlow Text Summary API, you can log textual
("%Y%m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(logdir)
with file_writer.as_default():
TensorBoard
#Importing Dataset
fashion_mnist = keras.datasets.fashion_mnist
ion_mnist.load_data()
'Coat',
#train model
model = keras.models.Sequential([
keras.layers.Flatten(input_shape=(28, 28)),
keras.layers.Dense(512, activation='relu'),
keras.layers.Dense(256, activation='relu'),
keras.layers.Dense(128, activation='relu'),
keras.layers.Dense(64, activation='relu'),
keras.layers.Dense(32, activation='relu'),
keras.layers.Dense(10, activation='softmax')
])
model.compile(loss='sparse_categorical_crossentropy', optimizer
='adam', metrics=['accuracy'])
the LambdaCallback .
plt.colorbar()
tick_marks = np.arange(len(class_names))
plt.yticks(tick_marks, class_names)
threshold = cm.max() / 2.
shape[1])):
r", color=color)
plt.tight_layout()
plt.ylabel('Real Class')
plt.xlabel('Predicted Class')
return figure
m%d-%H%M%S")
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)
dataset.
test_pred_raw = model.predict(test_images)
cm = sklearn.metrics.confusion_matrix(test_labels, test_pred)
cm_image = plot_to_image(figure)
with file_writer_cm.as_default():
ch)
cm_callback = keras.callbacks.LambdaCallback(on_epoch_end=log_confusion_matrix)
train_images,
train_labels,
epochs=2,
verbose=0,
callbacks=[tensorboard_callback, cm_callback],
validation_data=(test_images, test_labels),
# Starting TensorBoard.
Hyperparameter tuning with
TensorBoard
model.
classification problem.
#for kali
#for windows
import shutil
try:
shutil.rmtree('logs')
except:
pass
#for windows
import shutil
try:
shutil.rmtree('logsx')
except:
pass
Reload TensorBoard.
%reload_ext tensorboard
## Create hyperparameters
1, 0.0005, 0.0001]))
'rmsprop']))
METRIC_ACCURACY='accuracy'
with tf.summary.create_file_writer(log_dir).as_default():
hp.hparams_config(
hparams=
E],
metrics=[hp.Metric(METRIC_ACCURACY, display_name='Accuracy')],
Fit the models and include the log for metrics and
hyperparameters.
def create_model(hparams):
model = keras.models.Sequential()
model.add(Dense(3, activation='softmax'))
optimizer = hparams[HP_OPTIMIZER]
learning_rate = hparams[HP_LEARNING_RATE]
if optimizer == "adam":
    optimizer = tf.optimizers.Adam(learning_rate=learning_rate)
elif optimizer == "sgd":
    optimizer = tf.optimizers.SGD(learning_rate=learning_rate)
elif optimizer=='rmsprop':
    optimizer = tf.optimizers.RMSprop(learning_rate=learning_rate)
else:
imizer_name,))
# Compile the model with the optimizer and learning rate specified in hparams
model.compile(optimizer=optimizer,
loss='categorical_crossentropy',
metrics=['accuracy'])
return accuracy
with tf.summary.create_file_writer(run_dir).as_default():
accuracy = create_model(hparams)
#converting to tf scalar
mpy()
session_num = 0
hparams = {
HP_NUM_UNITS: num_units,
HP_DROPOUT: dropout_rate,
HP_OPTIMIZER: optimizer,
HP_LEARNING_RATE: learning_rate,
session_num += 1
upper pane.
You can view the results from Table View which shows the
corresponding accuracy.
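The loop that produces one TensorBoard run per hyperparameter combination can be sketched with `itertools.product`. The search space below is hypothetical (the values are placeholders, not this chapter's settings), and `iter_runs` is an illustrative helper name; a real loop would train and log a model inside it.

```python
import itertools

# Hypothetical search space; the values are placeholders.
search_space = {
    "num_units": [16, 32],
    "dropout": [0.1, 0.2],
    "optimizer": ["adam", "sgd", "rmsprop"],
    "learning_rate": [0.001, 0.0005, 0.0001],
}

def iter_runs(space):
    """Yield (run_name, hparams) for every combination in the search space."""
    keys = list(space)
    combos = itertools.product(*(space[k] for k in keys))
    for session_num, values in enumerate(combos):
        yield f"run-{session_num}", dict(zip(keys, values))

runs = list(iter_runs(search_space))
print(len(runs), runs[0])
```

Because the number of runs is the product of the list lengths, adding one more value to any hyperparameter multiplies the training cost accordingly.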
TensorFlow Profiler
Y%m%d-%H%M%S')
callbacks = [tf.keras.callbacks.TensorBoard(log_dir=log_dir, profile_batch='10,50')]
iris = datasets.load_iris()
model = create_model()
llbacks=callbacks)
Load TensorBoard and go to Profile in the dialog box to view
Overview page
Device Compute.
TF Op Placement.
The Step Time graph displays the device step time over all the
suggestions on how to improve your pipeline. The
implemented.
Trace viewer
The Trace Viewer is designed such that:
To the left (vertical grey column), you can see two major
or CPU resp.).
Trace Viewer makes it easy to understand the performance
shortcut S. A and D to move to the left and right, respectively.
Viewer window.
You can use your mouse to select multiple events and analyze
Input pipeline analyzer
Input Op statistics.
processes, including:
Compilation
Output
Input
Kernel launch
Host compute
Device to device
Host-side analysis details provide information on the
contained includes:
Enqueuing data
Data preprocessing
shown below.
statistics.
operation.
Count – number of instances of the operation execution
corresponding instance.
on each instance.
TensorFlow stats
pie charts.
The plot to the left shows the distribution of the total self-
execution time of each operation on the host, while the last plot
The TensorFlow statistics can be filtered by IDLE time from the
Other statistics included in the TensorFlow stats dashboard
given operations.
GPU kernel stats
If the host device runs with a TPU or GPU kernel, you can view
accelerated kernel.
Memory profile page
milliseconds.
profiling interval.
TensorBoard
begins training.
rftime("%Y%m%d-%H%M%S"))
tf.debugging.experimental.enable_dump_debug_info(
Load TensorBoard.
learning frameworks
frameworks.
TensorBoard in PyTorch
ort
rm -rf ./runs/
Next, define the data and model, and write the metrics to
#install torch
import torch
#data
x = torch.arange(-5, 5, 0.1).view(-1, 1)
y = -5 * x + 0.1 * torch.randn(x.size())
#model
model = torch.nn.Linear(1, 1)
criterion = torch.nn.MSELoss()
def train_model(iter):
y1 = model(x)
loss = criterion(y1, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
train_model(10)
writer.flush()
#close writer
writer.close()
import numpy as np
writer = SummaryWriter()
ter)
er)
Load TensorBoard.
%tensorboard --logdir=runs
TensorBoard in Keras
the experiment folder inside the folder containing the main logs.
tb_callback = tf.keras.callbacks.TensorBoard(log_dir="logs/experiment", histogram_freq=1)
model = create_model()
TensorBoard in XGBoost
notebook.
model.
import datetime
import os
'sampling_method']
class TensorBoardCallback(xgb.callback.TrainingCallback):
'''
Run experiments while scoring the model and saving the erro
'''
None):
        self.datetime_ = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
        self.log_dir = f"runs/{self.experiment}/{self.datetime_}"
        self.train_writer = SummaryWriter(log_dir=os.path.join(self.log_dir, "train/"))
        if self.data_name:
            self.test_writer = SummaryWriter(
                log_dir=os.path.join(self.log_dir, f"{self.data_name}/")
            )
def after_iteration(
ngCallback.EvalsLog
) -> bool:
if not evals_log:
return False
e) else log[-1]
if data == "train":
                    self.train_writer.add_scalar(metric_name, score, epoch)
                else:
                    self.test_writer.add_scalar(metric_name, score, epoch)
return False
X, y = fetch_openml(name="house_prices", return_X_y=True)
loat64']
X = X.select_dtypes(include=numerics)
size=0.2, random_state=100)
= True)
True)
e'}
bst = xgb.train(params, dtrain, num_boost_round=100, evals=[(dt
callbacks=[TensorBoardCallback(experiment='exp_1', data_name='test')])
to SummaryWriter.
TensorBoard in JAX and Flax
You can log evaluation metrics when using JAX during model
import torchvision.transforms.functional as F
log_folder = "runs"
writer = SummaryWriter(logdir)
oader)
training_loss.append(train_metrics['loss'])
training_accuracy.append(train_metrics['accuracy'])
t_labels)
testing_loss.append(test_metrics['loss'])
testing_accuracy.append(test_metrics['accuracy'])
h)
y'], epoch)
writer.add_scalar('Accuracy/test', test_metrics['accuracy'], epoch)
Pandas DataFrame
import tensorboard as tb
experiment_id = "c1KCv3X3QvGwaXfgX1c4tg"
experiment = tb.data.experimental.ExperimentFromDev(experiment_id)
df = experiment.get_scalars()
df.head()
You can also obtain the DataFrame as a wide format since, in the
experiment, the two tags ( epoch_loss and epoch_accuracy )
try:
experiment_id = "c1KCv3X3QvGwaXfgX1c4tg"
experiment = tb.data.experimental.ExperimentFromDev(experiment_id)
df = experiment.get_scalars()
df_wide = experiment.get_scalars(pivot=True)
display(df_wide.head())
except:
df_wide = experiment.get_scalars(pivot=False)
display(df_wide.head())
#path
import pandas as pd
csv_path = 'tensor_experiment_1.csv'
df_wide.to_csv(csv_path, index=False)
df_wide_roundtrip = pd.read_csv(csv_path)
pd.testing.assert_frame_equal(df_wide_roundtrip, df_wide)
such as Matplotlib.
Tensorboard.dev
On Jupyter Notebook:
You will be prompted to continue with the upload by entering
command prompt.
Limitations of using
TensorBoard
of TensorBoard include:
Final thoughts
process.
Functional API
used to design networks that are not linear. In this article, you
networks that:
Are non-linear.
Share layers.
straightforward.
model = keras.Sequential(
    [
        layers.Conv2D(32,
                      kernel_size=(parameters["kernel_size"], parameters["kernel_size"]),
                      input_shape=(parameters["shape"], parameters["shape"], 1),
                      activation=parameters["activation"]),
        layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"])),
        layers.Conv2D(64,
                      kernel_size=(parameters["kernel_size"], parameters["kernel_size"]),
                      activation=parameters["activation"]),
        layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"])),
        layers.Flatten(),
        layers.Dropout(parameters["dropout"]),
        layers.Dense(parameters["classes"], activation="softmax"),
    ]
)
The Sequential API limits you to one input and one output.
keras.utils.plot_model(model, "model.png",show_shapes=True)
Keras Functional models
Defining input
With the Sequential API, you don't have to define the input
inputs.shape
inputs.dtype
# tf.float32
The input layer is defined without the batch size if the data is
one-dimensional.
inputs = keras.Input(shape=(784,))
Connecting layers
conv2D = layers.Conv2D(32)
x = conv2D(inputs)
by layer 'conv2d_7')>
tensor.
conv2D
by layer 'conv2d_8')>
Functional API.
maxPooling2D = layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"]))(conv2D)
maxPooling2D_2 = layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"]))(conv2D_2)
flatten = layers.Flatten()(maxPooling2D_2)
dropout = layers.Dropout(parameters["dropout"])(flatten)
A Keras model is created using the keras.Model function while passing the
inputs and outputs.
_model")
We can plot the model to confirm that it's similar to the one we
defined using the Sequential API.
keras.utils.plot_model(model, "model.png",show_shapes=True)
Training and evaluation of
d_data()
model.compile(
loss=keras.losses.SparseCategoricalCrossentropy(from_logits
=True),
optimizer=keras.optimizers.RMSprop(),
metrics=["accuracy"],
validation_split=0.2)
API models
model.save("saved_model")
del model
model = keras.models.load_model("saved_model")
model.summary()
model
seq_model = keras.models.Sequential()
seq_model.add(layer)
seq_model.summary()
model
models.
inputs = keras.Input(batch_shape=seq_model.layers[0].input_shape)
x = inputs
x = layer(x)
outputs = x
func_model.summary()
Multilayer perceptron
dense1 = layers.Dense(128)(inputs)
dropout = layers.Dropout(parameters["dropout"])(dense1)
dense2 = layers.Dense(128)(dropout)
dropout1 = layers.Dropout(parameters["dropout"])(dense2)
_model")
keras.utils.plot_model(model, "model.png",show_shapes=True)
maxPooling2D = layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"]))(conv2D)
maxPooling2D_2 = layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"]))(conv2D_2)
flatten = layers.Flatten()(maxPooling2D_2)
dropout = layers.Dropout(parameters["dropout"])(flatten)
_model")
keras.utils.plot_model(model, "model.png",show_shapes=True)
Recurrent Neural Network
inputs = keras.Input(784,)
s)
quences=True))(embedding)
bidirectional2 = layers.Bidirectional(layers.LSTM(64,))(bidirectional1)
keras.utils.plot_model(model, "model.png",show_shapes=True)
This example defines a CNN with one input layer shared by two
maxPooling2D = layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"]))(conv2D)
flatten1 = layers.Flatten()(maxPooling2D)
conv2D_2 = layers.Conv2D(64, kernel_size=(parameters["kernel_si
maxPooling2D_2 = layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"]))(conv2D_2)
flatten2 = layers.Flatten()(maxPooling2D_2)
# merge layers
dropout = layers.Dropout(parameters["dropout"])(merged_layers)
model = keras.Model(inputs=inputs, outputs=outputs, name="mnist_model")
keras.utils.plot_model(model, "model.png",show_shapes=True)
different layers.
s)
quences=True))(embedding)
quences=True))(embedding)
# merge layers
al2])
dense1 = layers.Dense(32, activation='relu')(merged_layers)
model")
keras.utils.plot_model(model, "model.png",show_shapes=True)
models
Sequential API.
Multiple input model
input1 = keras.Input(shape=(16,))
x1 =layers.Dense(8, activation='relu')(input1)
input2 = layers.Input(shape=(32,))
x2 = layers.Dense(8, activation='relu')(input2)
keras.utils.plot_model(model, "model.png",show_shapes=True)
x = layers.Conv2D(filters=32,kernel_size=(3,3),activation='relu')(image_input)
x = layers.MaxPooling2D(pool_size=(2,2))(x)
x = layers.Conv2D(filters=32,kernel_size=(3,3), activation='relu')(x)
x = layers.Dropout(0.25)(x)
x = layers.Conv2D(filters=64,kernel_size=(3,3), activation='relu')(x)
x = layers.MaxPooling2D(pool_size=(2,2))(x)
x = layers.Dropout(0.25)(x)
x = layers.Flatten()(x)
x = layers.Dense(128, activation='relu')(x)
x = layers.Dropout(0.25)(x)
model = keras.Model(
inputs=image_input,
outputs=[gender_prediction, age_prediction],
)
keras.utils.plot_model(model, "model.png",show_shapes=True)
Use the same graph of layers to
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.Conv2D(16, 3, activation="relu")(x)
encoder_output = layers.GlobalMaxPooling2D()(x)
der")
encoder.summary()
x = layers.Reshape((4, 4, 1))(encoder_output)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, activation="relu")(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
u")(x)
="autoencoder")
autoencoder.summary()
end example
Age
Hair color
Mustache color
Eye color
Data download
an assets folder.
Data processing
Next, we load the JSON file using Pandas. The file contains the
classification for each image.
import pandas as pd
df = pd.read_json("labels-en.json")
df.head()
Add image path column
We will load the images using the Pandas DataFrame. To do that, we need
to provide a path for each image. Let's add the
def image_names(externalId):
return f"{externalId}.png"
df["image_path"] = df["externalId"].map(image_names)
df.tail()
list.
age = []
hair = []
beard = []
mustache = []
eye = []
def get_answers(tasks):
all_it = all_tasks[0]
if item['title'] == 'Age':
age.append(item['answer'])
hair.append(item['answer'])
beard.append(item['answer'])
mustache.append(item['answer'])
eye.append(item['answer'])
Next, we use these lists to create a new column for each face
attribute.
get_answers(df['tasks'])
df['age'] = age
df['hair_color'] = hair
df['beard_color'] = beard
df['mustache_color'] = mustache
df['eye_color'] = eye
Label encoding
We now have the labels for each face attribute. The next step is
age_labelencoder = LabelEncoder()
hair_labelencoder = LabelEncoder()
beard_labelencoder = LabelEncoder()
mustache_labelencoder = LabelEncoder()
eye_labelencoder = LabelEncoder()
df = df.assign(age = age_labelencoder.fit_transform(df["age"]))
df = df.assign(hair_color = hair_labelencoder.fit_transform(df["hair_color"]))
df = df.assign(beard_color = beard_labelencoder.fit_transform(df["beard_color"]))
df = df.assign(mustache_color = mustache_labelencoder.fit_transform(df["mustache_color"]))
df = df.assign(eye_color = eye_labelencoder.fit_transform(df["eye_color"]))
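To show what label encoding does to a column, here is a minimal pure-Python sketch of the same idea: collect the distinct labels in sorted order and replace each label with its position. The helper name `fit_label_encoder` and the sample hair colors are assumptions for illustration; the chapter itself relies on scikit-learn's `LabelEncoder`.

```python
def fit_label_encoder(values):
    """Return the sorted class list and a label -> integer mapping."""
    classes = sorted(set(values))
    mapping = {label: code for code, label in enumerate(classes)}
    return classes, mapping

# Hypothetical sample column.
hair = ["Brown", "Black", "Blonde", "Brown", "Other"]
classes, mapping = fit_label_encoder(hair)
encoded = [mapping[v] for v in hair]

print(classes)
print(encoded)
# Decoding is a lookup back into the class list.
print([classes[code] for code in encoded])
```

Each attribute needs its own encoder, because the integer codes are only meaningful relative to that attribute's class list.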
Generate tf.data dataset
case:
and shear_range .
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   validation_split=0.2)
validation_gen = ImageDataGenerator(rescale=1./255,validation_split=0.2)
size.
batch_size = 32
base_dir = 'assets'
target_columns = ['age','hair_color','beard_color','mustache_color','eye_color']
training_set = train_datagen.flow_from_dataframe(df,base_dir,
                                                 seed=101,
                                                 target_size=image_size,
                                                 batch_size=batch_size,
                                                 x_col='image_path',
                                                 y_col=target_columns,
                                                 subset = 'training',
                                                 class_mode='multi_output')
validation_set = validation_gen.flow_from_dataframe(df,base_dir,
                                                    target_size=image_size,
                                                    batch_size=batch_size,
                                                    x_col='image_path',
                                                    y_col=target_columns,
                                                    subset = 'validation',
                                                    class_mode='multi_output')
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 10))
for i in range(25):
ax = plt.subplot(5, 5, i + 1)
plt.imshow(images[i])
plt.axis("off")
break
network
sum the output from each block. After that, we create five
dense layers that will produce the prediction for each of the
softmax activation for each layer because the classes are more
than two.
the summary and plot of the network. It will also make it easier
to evaluate the performance of the network.
latten,Dropout,Resizing
ator
x = Conv2D(filters=32,kernel_size=(3,3),activation="relu",name="first_block_conv2d")(image_input)
x = MaxPooling2D(pool_size=(2,2),name="first_block_maxpool2d")(x)
first_block_output = Flatten(name="first_block_flatten")(x)
e="second_block_conv2d")(image_input)
x = MaxPooling2D(pool_size=(2,2),name="second_block_maxpool2d")(x)
x = Flatten(name="second_block_flatten")(x)
="second_block_add")
x = Conv2D(filters=32,kernel_size=(3,3), activation='relu',name="third_block_conv2d")(image_input)
x = MaxPooling2D(pool_size=(2,2),name="third_block_maxpool2d")(x)
x = Flatten(name="third_block_flatten")(x)
="third_block_add")
x = Dropout(0.25, name="dropout1")(third_block_output)
x",name="dense_age")(x)
="softmax",name="dense_hair")(x)
n="softmax",name="dense_beard")(x)
required.
model = keras.Model(
inputs=image_input,
outputs=[ age_prediction,
hair_prediction,beard_prediction,
mustache_prediction,eye_prediction
],
)
Plot and inspect the Keras
Functional model
intended.
keras.utils.plot_model(model, "model.png",show_shapes=True)
network
model.compile(optimizer='adam',
              loss=keras.losses.SparseCategoricalCrossentropy(),
              metrics=keras.metrics.SparseCategoricalAccuracy())
Let's train this network with the EarlyStopping callback and 100
epochs. We stop training if the network doesn't improve for 10
trains.
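The patience rule behind early stopping can be sketched without Keras. The function below is a simplified illustration, not the library's implementation: it tracks the best validation loss seen so far and reports the epoch at which training would stop once the loss has failed to improve for `patience` consecutive epochs. The loss values are made up.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation losses.
losses = [0.9, 0.7, 0.6, 0.65, 0.61, 0.62, 0.5]
print(early_stopping_epoch(losses, patience=3))
print(early_stopping_epoch(losses, patience=10))
```

With a patience of 3, training halts before the later improvement to 0.5 is ever seen, which is why a larger patience is often paired with a high epoch budget.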
epochs=100
history = model.fit(training_set,validation_data=validation_se
network
attributes.
The history variable contains the metrics for the five attributes.
metrics_df = pd.DataFrame(history.history)
metrics_df[["dense_hair_sparse_categorical_accuracy","val_dense_hair_sparse_categorical_accuracy"]].plot()
better results.
Functional model
image_url = "https://storage.googleapis.com/ango-covid-dataset/ffhq-dataset/batch2/25000.png"
test_image = keras.utils.load_img(
plt.axis("off")
plt.imshow(test_image);
img_array = tf.keras.utils.img_to_array(test_image)
img_array = tf.expand_dims(img_array, 0)
predictions = model.predict(img_array)
predictions
age_labelencoder.classes_
'Other'],
# dtype=object)
Let's bundle all this into a function that will receive an image URL and do
the following:
Convert it to NumPy.
def make_face_prediction(image_url):
    import tensorflow as tf
    import numpy as np
    test_image = keras.utils.load_img(
    img_array = tf.keras.utils.img_to_array(test_image)
    img_array = tf.expand_dims(img_array, 0)
    predictions = model.predict(img_array)
    age_predictions = predictions[0][0]
    hair_predictions = predictions[1][0]
    beard_predictions = predictions[2][0]
    mustache_predictions = predictions[3][0]
    eye_predictions = predictions[4][0]
    age_scores = tf.nn.softmax(age_predictions).numpy()
    hair_scores = tf.nn.softmax(hair_predictions).numpy()
    beard_scores = tf.nn.softmax(beard_predictions).numpy()
    mustache_scores = tf.nn.softmax(mustache_predictions).numpy()
    eye_scores = tf.nn.softmax(eye_predictions).numpy()
print(f"Age: {list(age_labelencoder.classes_)[np.argmax(age
nt confidence.")
percent confidence.")
make_face_prediction('https://storage.googleapis.com/ango-covid-dataset/ffhq-dataset/batch2/25000.png')
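The decoding step inside the function (take the highest score, look up its class name, report it as a percentage) can be sketched in plain Python. The class names and probabilities below are hypothetical. The sketch also illustrates one thing worth checking in your own code: the output layers visible above (`dense_hair`, `dense_beard`) already use a softmax activation, so applying `tf.nn.softmax` to their output a second time keeps the predicted class but flattens the reported confidence.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def decode(scores, class_names):
    """Return (label, percent) for the highest-scoring class."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best], 100 * scores[best]


# Hypothetical class names and an output that is already a probability vector.
hair_classes = ["black", "blond", "brown"]
probabilities = [0.05, 0.90, 0.05]

once = decode(probabilities, hair_classes)
twice = decode(softmax(probabilities), hair_classes)
print(once)   # label with the original confidence
print(twice)  # same label, noticeably lower confidence
```

If the reported confidences look suspiciously close to each other, checking whether softmax is applied twice is a reasonable first step.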
and weaknesses
Functional API:
example Conv2D(...,name="first_conv_layer") .
summary.
Final thoughts
In this article, you have discovered that you can design neural
have covered:
However, you may want more control over the training process.
gradient.
This article will walk you through the process of doing that.
Obtain dataset
We'll use the Fashion MNIST dataset for this illustration and load it using
the Layer data loader.
import layer
mnist_train = layer.get_dataset('layer/fashion_mnist/datasets/fashion_mnist_train').to_pandas()
mnist_test = layer.get_dataset('layer/fashion_mnist/datasets/fashion_mnist_test').to_pandas()
mnist_train["images"][17]
mnist_test["images"][23]
Data processing
import numpy as np
def images_to_np_array(image_column):
return np.array([np.array(im.getdata()).reshape((im.size
train_images = images_to_np_array(mnist_train.images)
test_images = images_to_np_array(mnist_test.images)
train_labels = mnist_train.labels
test_labels = mnist_test.labels
image data.
train_images.shape
train_images.shape
Next, let's define the number of images that will be passed to the network.
32 is a common choice, but this number can be
ds_train_batch = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
training_data = ds_train_batch.batch(32)
ds_test_batch = tf.data.Dataset.from_tensor_slices((test_images, test_labels))
testing_data = ds_test_batch.batch(32)
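Conceptually, batching slices the dataset into consecutive groups of at most 32 items. The sketch below shows that idea in plain Python; the `batches` helper and the 100-item list are illustrative stand-ins, not part of `tf.data`.

```python
def batches(items, batch_size=32):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


samples = list(range(100))  # stand-in for 100 (image, label) pairs
sizes = [len(batch) for batch in batches(samples, 32)]
print(sizes)  # [32, 32, 32, 4]
```

The last batch is smaller when the dataset size is not a multiple of the batch size. The Fashion MNIST training set has 60,000 images, which divides evenly into 1,875 batches of 32.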
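class MyDenseLayer(tf.keras.layers.Layer):
    def __init__(self, num_outputs):
        super(MyDenseLayer, self).__init__()
        self.num_outputs = num_outputs

    def build(self, input_shape):
        # Create the weights once the input shape is known.
        self.kernel = self.add_weight(name="kernel",
                                      shape=[int(input_shape[-1]),
                                             self.num_outputs])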
layer = MyDenseLayer(10)
Convolution.
Max pooling.
Flatten.
Dropout.
Dense.
class CustomBlock(tf.keras.Model):
super(CustomBlock, self).__init__(name='')
self.conv2a = layers.Conv2D(filters=filters1,input_shape=(2
self.maxpool1a = layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"]))
self.maxpool2b = layers.MaxPooling2D(pool_size=(parameters["pool_size"], parameters["pool_size"]))
self.flatten1a = layers.Flatten()
self.dropout1a = layers.Dropout(parameters["dropout"])
on="softmax")
x = self.conv2a(input_tensor)
x = tf.nn.relu(x)
x = self.maxpool1a(x)
x = self.conv2b(x)
x = tf.nn.relu(x)
x = self.maxpool2b(x)
x = self.flatten1a(x)
x = self.dropout1a(x)
x = self.dense1a(x)
return tf.nn.softmax(x)
Let's initialize the model and check the layers and variables.
model = CustomBlock([32,64])
x = tf.random.normal(input_shape)
_ = model(x)
x.shape
model.layers
len(model.variables)
The model can be used to make predictions even before
instead. The goal is to reduce the errors between the true and
loss_object = tf.keras.losses.SparseCategoricalCrossentropy()
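To make the loss concrete: with its default settings, sparse categorical cross-entropy takes an integer class label per example and a vector of predicted probabilities, and averages the negative log of the probability assigned to the true class. The following is a minimal plain-Python sketch of that formula with made-up numbers; it leaves out details a library implementation handles, such as clipping probabilities to avoid `log(0)`.

```python
import math


def sparse_categorical_crossentropy(labels, probabilities):
    """Mean of -log(p[true_class]) over all examples."""
    losses = [-math.log(probs[label])
              for label, probs in zip(labels, probabilities)]
    return sum(losses) / len(losses)


# Two made-up examples with three classes each.
labels = [0, 2]
predicted = [
    [0.8, 0.1, 0.1],  # true class 0 gets probability 0.8
    [0.2, 0.2, 0.6],  # true class 2 gets probability 0.6
]
loss = sparse_categorical_crossentropy(labels, predicted)
print(round(loss, 4))
```

The loss shrinks toward zero as the probability assigned to the correct class approaches 1, which is the quantity the training loop tries to minimize.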
ifferent
# behavior during training versus inference (e.g. Dropout).
y_ = model(x, training=training)
backward pass.
variables)
Create an optimizer
optimizer = tf.keras.optimizers.Adam()
The training loop feeds the training images to the network while
train_loss_results = []
train_accuracy_results = []
num_epochs = 10
epoch_loss_avg = tf.keras.metrics.Mean()
epoch_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()
for x, y in training_data:
optimizer.apply_gradients(zip(grads, model.trainable_variables))
# Track progress
h loss
ferent
# End epoch
train_loss_results.append(epoch_loss_avg.result())
train_accuracy_results.append(epoch_accuracy.result())
+ 1,
epoch_loss_avg.result(),
epoch_accuracy.result()))
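The structure of this loop (iterate over epochs, iterate over batches, compute the loss and its gradient, apply an update, record the epoch average) can be shown without TensorFlow. The sketch below is a deliberately small stand-in: it assumes a one-weight linear model, a hand-derived gradient in place of `tf.GradientTape`, and plain gradient descent in place of Adam. The data, names, and learning rate are illustrative only.

```python
def train(data, num_epochs=10, learning_rate=0.05, batch_size=2):
    """Fit y = weight * x with mini-batch gradient descent."""
    weight = 0.0
    loss_history = []
    for _ in range(num_epochs):
        batch_losses = []
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Forward pass and mean squared error for this batch.
            errors = [weight * x - y for x, y in batch]
            loss = sum(e * e for e in errors) / len(batch)
            # d(loss)/d(weight) = mean(2 * error * x), derived by hand.
            grad = sum(2 * e * x for e, (x, _) in zip(errors, batch)) / len(batch)
            # Update step (the role apply_gradients plays in the text).
            weight -= learning_rate * grad
            batch_losses.append(loss)
        # Per-epoch average, analogous to epoch_loss_avg.result().
        loss_history.append(sum(batch_losses) / len(batch_losses))
    return weight, loss_history


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # generated by y = 2x
weight, history = train(data)
print(round(weight, 4), history[0] > history[-1])
```

In the TensorFlow version, the optimizer and the gradient tape take over the two hand-written lines for the gradient and the update; the surrounding bookkeeping stays the same.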
axes[0].set_ylabel("Loss", fontsize=14)
axes[0].plot(train_loss_results)
axes[1].set_ylabel("Accuracy", fontsize=14)
axes[1].set_xlabel("Epoch", fontsize=14)
axes[1].plot(train_accuracy_results)
plt.show()
test data, make predictions and compare them with the true
value.
test_accuracy = tf.keras.metrics.Accuracy()
erent
test_accuracy(prediction, y)
()))
predictions
ent
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
class_idx = tf.math.argmax(logits).numpy()
p = tf.nn.softmax(logits)[class_idx]
name = class_names[class_idx]
0*p))
Final thoughts
You have now learned to create custom layers and training
processing that happens when you call the fit method from
built-in layers.
Creating the custom loop function that utilizes the loss and
gradient functions.
The next time you train a deep learning model, the training will be on
the GPU by default.
on Apple Silicon
to:
Video editing.
performance. https://developer.apple.com/documentatio
TensorFlow
metal PluggableDevice.
Install Tensorflow-metal PluggableDevice
chmod +x ~/Downloads/Miniforge3-MacOSX-arm64.sh
sh ~/Downloads/Miniforge3-MacOSX-arm64.sh
source ~/miniforge3/bin/activate
Install TensorFlow:
PyTorch
As of this writing, you must install the Preview (Nightly) build to train the
PyTorch model on Apple Silicon GPUs. This will be
device = torch.device("mps")
Final thoughts
API
In this article, we'll use the Coco Car Damage Detection
cars and car parts. The dataset has already been annotated,
detection
If you have a custom dataset you'd like to use, then you have to
and online platforms that can help you achieve this. If you
it. Once you save it, Labelme will store the resulting JSON file
If you are looking for an online tool, here are some platforms
images.
Segments AI lists some object detection and image segmentation datasets
that you can clone into your
Detection API?
CenterNet
EfficientDet
SSD MobileNet
SSD ResNet
Faster R-CNN
ExtremeNet
Mask R-CNN
project, we'll use the Mask R-CNN model, but you can also try
the car images data and the corresponding COCO JSON files
somewhere.
on Colab:
%%bash
cd models/research
# Compile protos.
cp object_detection/packages/tf2/setup.py .
rfile -t od .
successfully.
import matplotlib
import os
import random
import io
import imageio
import glob
import scipy.misc
import numpy as np
import tensorflow as tf
import pathlib
import itertools
tils
%matplotlib inline
dataset
The dataset and the config file for the model we'll be training
some changes after you download the config from the object
The next step is to download the Mask R-CNN model that we'll
fine-tune. Extract the file to get the trained model checkpoint.
import wget
model_link = "http://download.tensorflow.org/models/object_detection/tf2/20200711/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz"
wget.download(model_link)
import tarfile

# The context manager closes the archive even if extraction fails.
with tarfile.open('/content/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz') as tar:
    tar.extractall('.')
file. You will always have to edit this file after downloading each
model.
Let's look at the items in the configuration file that you need to
update.
The config file you'll get after cloning this repo has been edited
here are the items you need to update after downloading the
API repo:
and rear_bumper .
The more steps you set, the longer it will take to train the model.
You can increase the steps if the loss is still decreasing and
TFRecords
stored.
r\
--train_image_dir=/content/maskrcnn/data/train \
--test_image_dir=/content/maskrcnn/data/val \
--train_annotations_file=/content/maskrcnn/data/train/COCO_mul_train_annos.json \
--test_annotations_file=/content/maskrcnn/data/val/COCO_mul_val_annos.json \
--output_dir=/content/maskrcnn/data/tf_records
You now have everything you need to train this Mask R-CNN
configuration file.
saved.
!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path=/content/maskrcnn/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8-colab.config \
    --model_dir=/content/training \
    --alsologtostderr
You might get an OpenCV error on Colab. This error can be fixed by
installing the right version of
OpenCV.
If you get a cuDNN error, you can fix it by installing the right version of
cuDNN.
cuda11.2
visualization
%load_ext tensorboard
model.
file.
The conversion script will output checkpoint files,
Colab
You may want to download the converted model or trained
model. That can be done by zipping the files and using Colab
nn
files.download("/content/maskrcnn.zip")
files.download("/content/fine_tuned_model.zip")
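Zipping a folder before downloading it can be done with Python's standard library. The sketch below is a minimal example under stated assumptions: it builds a throwaway folder in a temporary directory so it runs anywhere, and the folder name `fine_tuned_model` and its placeholder file are hypothetical stand-ins for your real training output. It does not call Colab's download helper.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical stand-in for a folder of exported model files.
    model_dir = Path(workdir) / "fine_tuned_model"
    model_dir.mkdir()
    (model_dir / "checkpoint").write_text("placeholder")

    # make_archive appends ".zip" to base_name and returns the archive path.
    archive_path = shutil.make_archive(
        base_name=str(Path(workdir) / "fine_tuned_model"),
        format="zip",
        root_dir=str(model_dir),
    )

    # Inspect the archive to confirm what was packed.
    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()
    archive_name = Path(archive_path).name

print(archive_name, names)
```

On Colab, the resulting `.zip` path is what you would then pass to the download call shown above.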
CNN
It's now time to use the trained Mask R-CNN model to perform
NumPy array
def load_image_into_numpy_array(path):
pe
Args:
Returns:
"""
image = Image.open(BytesIO(img_data))
return np.array(image.getdata()).reshape(
Visualize detections
The next utility is a function for plotting the detections
using Matplotlib.
def plot_detections(image_np,
                    boxes,
                    classes,
                    scores,
                    category_index,
                    figsize=(12, 16),
                    image_name=None):
Args:
th, 3)
s are 1-based,
e, then
es.
ch holding
tegory indices.
"""
    image_np_with_annotations = image_np.copy()
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_annotations,
        boxes,
        classes,
        scores,
        category_index,
        use_normalized_coordinates=True,
        min_score_thresh=0.8)
    if image_name:
        plt.imsave(image_name, image_np_with_annotations)
    else:
        plt.imshow(image_np_with_annotations)
checkpoint
Let's now create a detection model from the last saved model
checkpoint.
filenames = list(pathlib.Path('/content/training/').glob('*.index'))
filenames.sort()
print(filenames)
model_dir = '/content/training/'
#generally you want to put the last ckpt from training in here
configs = config_util.get_configs_from_pipeline_file(pipeline_file)
model_config = configs['model']
detection_model = model_builder.build(
model_config=model_config, is_training=False)
# Restore checkpoint
ckpt = tf.compat.v2.train.Checkpoint(
    model=detection_model)
ckpt.restore(os.path.join(str(filenames[-1]).replace('.index', '')))
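One detail to watch when picking the last checkpoint from a sorted list: paths sort lexicographically, so a file such as `ckpt-10.index` sorts before `ckpt-2.index`. The sketch below shows the effect and one way around it, assuming checkpoint files are named `ckpt-<number>.index`; that naming pattern and the helper `latest_checkpoint_prefix` are assumptions for illustration.

```python
import re
from pathlib import PurePosixPath


def latest_checkpoint_prefix(index_files):
    """Return the checkpoint prefix with the highest numeric suffix."""
    def step(path):
        match = re.search(r"-(\d+)\.index$", path.name)
        return int(match.group(1)) if match else -1

    newest = max(index_files, key=step)
    return str(newest)[: -len(".index")]


# Hypothetical checkpoint files from a run that saved ten checkpoints.
files = [
    PurePosixPath("/content/training/ckpt-2.index"),
    PurePosixPath("/content/training/ckpt-10.index"),
    PurePosixPath("/content/training/ckpt-9.index"),
]

print(sorted(files)[-1].name)           # lexicographic sort picks ckpt-9.index
print(latest_checkpoint_prefix(files))  # numeric sort picks ckpt-10
```

TensorFlow also provides `tf.train.latest_checkpoint(checkpoint_dir)`, which reads the directory's checkpoint state file; it may be a simpler option when that file is present.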
def get_model_detection_function(model):
@tf.function
def detect_fn(image):
1])
return detect_fn
detect_fn = get_model_detection_function(detection_model)
Map labels for inference
decoding
label_map_path = configs['eval_input_config'].label_map_path
label_map = label_map_util.load_labelmap(label_map_path)
categories = label_map_util.convert_label_map_to_categories(
    label_map,
    max_num_classes=label_map_util.get_max_label_map_index(label_map),
    use_display_name=True)
category_index = label_map_util.create_category_index(categories)
label_map_dict = label_map_util.get_label_map_dict(label_map, use_display_name=True)
# It takes a little longer on the first run and then runs at normal speed.
TEST_IMAGE_PATHS = glob.glob('/content/maskrcnn/data/test/*.jpg')
image_path = random.choice(TEST_IMAGE_PATHS)
image_np = load_image_into_numpy_array(image_path)
input_tensor = tf.convert_to_tensor(
label_id_offset = 1
image_np_with_detections = image_np.copy()
viz_utils.visualize_boxes_and_labels_on_image_array(
    image_np_with_detections,
    detections['detection_boxes'][0].numpy(),
    (detections['detection_classes'][0].numpy() + label_id_offset).astype(int),
    detections['detection_scores'][0].numpy(),
    category_index,
    use_normalized_coordinates=True,
    max_boxes_to_draw=200,
    min_score_thresh=.5,
    agnostic_mode=False,
)
plt.figure(figsize=(12,16))
plt.imshow(image_np_with_detections)
plt.axis("off")
plt.show()
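Two arguments in the visualization call decide which detections end up on the image: `label_id_offset` shifts the model's zero-based class indices to the one-based IDs used in the label map, and `min_score_thresh` drops low-confidence boxes (0.5 here, 0.8 in `plot_detections`). The sketch below reproduces that selection logic in plain Python. The helper `select_detections`, the scores, and the class name `headlamp` are hypothetical; `rear_bumper` is taken from the label names mentioned earlier.

```python
def select_detections(classes, scores, category_index,
                      label_id_offset=1, min_score_thresh=0.5):
    """Return (class name, score) pairs for detections above the threshold."""
    selected = []
    for class_index, score in zip(classes, scores):
        if score > min_score_thresh:
            # Shift the zero-based model output to the one-based label map ID.
            label_id = int(class_index) + label_id_offset
            selected.append((category_index[label_id]["name"], score))
    return selected


# A category index maps one-based IDs to dictionaries with "id" and "name".
category_index = {
    1: {"id": 1, "name": "headlamp"},
    2: {"id": 2, "name": "rear_bumper"},
}
classes = [0, 1, 1]          # made-up zero-based model outputs
scores = [0.92, 0.48, 0.71]  # made-up confidence scores

kept = select_detections(classes, scores, category_index)
print(kept)  # [('headlamp', 0.92), ('rear_bumper', 0.71)]
```

Raising the threshold trades recall for precision: at 0.8 only the first detection in this example would be drawn.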
R-CNN
The Mask R-CNN object detection model can be used for both
def load_model(model_dir):
    model = tf.saved_model.load(str(model_dir))
    return model
model_dir = '/content/finetuned-maskrcnn/saved_model'
masking_model = load_model(model_dir)
# List of strings used to add the correct label to each box.
PATH_TO_LABELS = '/content/maskrcnn/data/labelmap.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)
The next step is to define the path to the test images. In this
case, we'll use all the test images because there aren't that
many.
PATH_TO_TEST_IMAGES_DIR = pathlib.Path('/content/maskrcnn/data/test')
TEST_IMAGE_PATHS = sorted(list(PATH_TO_TEST_IMAGES_DIR.glob("*.jpg")))
input_tensor = tf.convert_to_tensor(image_np)
`tf.newaxis`.
detections = model(input_tensor)
batch dimension.
num_detections = int(detections.pop("num_detections"))
detections))
detections["num_detections"] = num_detections
image_np_with_detections = image_np.copy()
if "detection_masks" in detections:
detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
    detections["detection_masks"][0], detections["detection_boxes"][0],
    image_np.shape[0], image_np.shape[1])
detection_masks_reframed = tf.cast(detection_masks_refr
tf.uint8)
detections["detection_masks_reframed"] = detection_masks_reframed.numpy()
boxes = np.asarray(detections["detection_boxes"][0])
classes = np.asarray(detections["detection_classes"][0]).astype(np.int64)
scores = np.asarray(detections["detection_scores"][0])
mask = np.asarray(detections["detection_masks_reframed"])
viz_utils.visualize_boxes_and_labels_on_image_array(
    image_np_with_detections,
    boxes,
    classes,
    scores,
    category_index,
    instance_masks=mask,
    use_normalized_coordinates=True,
    line_thickness=3)
display(Image.fromarray(image_np_with_detections))
detection
def show_inference(model, image_path):
# Load image
image_np = np.array(Image.open(image_path))
# Actual detection.
p)
show_inference(masking_model, image_path)
Final thoughts
detection model.
Click the Colab link to try the project from start to finish. You can also
replace the dataset with another one. If you change
Appendix
This book is provided in line with our terms and privacy policy.
Disclaimer
a production
The author has made every effort to ensure the accuracy of the
information within
hereby disclaims any liability to any party for any loss, damage,
or disruption caused
from accident,
form or by any
Copyright
Learn Python
Learn Streamlit