
Cheatsheets / Intro to Deep Learning with TensorFlow

Introduction to TensorFlow

One-Hot Encoding with Python

When working with nominal categorical variables in Python, it can be useful to use one-hot encoding, a technique that effectively creates binary variables for each of the nominal categories. This encodes the variable without creating an order among the categories. To one-hot encode a variable in a pandas DataFrame, we can use the .get_dummies() method.

df = pd.get_dummies(data=df, columns=['column1', 'column2'])
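
As a quick illustration, here is a small sketch with a hypothetical DataFrame containing a nominal 'color' column; one-hot encoding replaces it with one binary column per category:

import pandas as pd

# hypothetical example data with one nominal categorical column
df = pd.DataFrame({'color': ['red', 'blue', 'red'], 'price': [3, 5, 4]})

# one-hot encode 'color'; 'price' is left unchanged
df = pd.get_dummies(data=df, columns=['color'])
print(list(df.columns))  # ['price', 'color_blue', 'color_red']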

Exploring Data for Deep Learning

Before diving into your deep learning, it is best practice to investigate your dataset to get acquainted with the features, size, and structure of the information you are working with. You can investigate your data with pandas, using properties such as .shape and methods like .describe() .

Neural networks cannot work with string data. Therefore, if upon inspection you find that your data contains strings, you can use one-hot encoding to convert categorical features into numerical features. To do this in Python, you can use the .get_dummies() pandas method, as shown at the end of the code below.

# load the dataset
dataset = pd.read_csv('dataset.csv')

# choose the first six columns as features
features = dataset.iloc[:, 0:6]

# choose the final column for prediction
labels = dataset.iloc[:, -1]

# see useful summary statistics for numeric features
print(features.describe())

# shape and summary statistics of labels
print(labels.shape)
print(labels.describe())

# use one-hot encoding
numerical_features = pd.get_dummies(features)


Train and Test Sets

When training a deep learning model (or any other machine learning model), split your data into train and test sets. The train set is used during the learning process, while the test set is used to evaluate the results of your model.

To perform this split in Python, we use the train_test_split() function from the scikit-learn library.

from sklearn.model_selection import train_test_split

# Here we choose the test size to be 33% of the total data, and random_state
# controls the shuffling applied to the data before applying the split.
features_train, features_test, labels_train, labels_test = train_test_split(features, labels, test_size=0.33, random_state=42)

Scaling Your Data

When preprocessing our data, we want to make sure all our features have similar scales. This is because deep learning models (like all learning models) perform better if all our features are weighted equally. Standardization and normalization are both common scaling methods.

Standardization scales all the features to have a mean of zero and a unit variance (equal to one). Normalization scales all the features to a fixed range, normally between 0 and 1. Both are viable options when getting your data prepared for the learning process.

# Standardization can be implemented in the following way with scikit-learn:
from sklearn.preprocessing import StandardScaler
from sklearn.compose import ColumnTransformer

ct = ColumnTransformer([('scale', StandardScaler(), ['age', 'bmi', 'children'])], remainder='passthrough')
features_train = ct.fit_transform(features_train)
features_test = ct.transform(features_test)


Sequential Model

A sequential deep learning model is a linear stack of layers, with one input layer where data enters the neural network and one output layer where data exits the neural network. These stacked layers each contain at least one neuron, and they are the building blocks of our neural networks. In a layer diagram, the W and b labels represent a layer's weights and bias.

from tensorflow.keras.models import Sequential
from tensorflow.keras import layers

# initializing a sequential model
model = Sequential()

# creating a layer with 3 neurons
layer = layers.Dense(3)
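
To see what W and b look like in practice, here is a minimal sketch (assuming a hypothetical input of 4 features); once a Dense layer is added to a model with a known input shape, its weight matrix and bias vector can be inspected:

# hypothetical: 4 input features feeding a layer of 3 neurons
model.add(layers.Dense(3, input_shape=(4,)))
weights, bias = model.layers[0].get_weights()
print(weights.shape)  # (4, 3): one weight per input feature per neuron
print(bias.shape)     # (3,): one bias per neuron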

Optimizing Loss

When compiling a deep learning model, loss is measured to evaluate the success of the results. A lower loss means better performance. Since the goal is to achieve the best performance possible (without overfitting or underfitting), optimizers are used to continuously update the weights and parameters and improve the loss metric.

In the case of regression, the most often used loss function is the mean squared error, mse (the average squared difference between the estimated values and the actual values).

Additionally, we want to observe the progress of the mean absolute error ( mae ) while training the model, because MAE can give us a better idea than MSE of how far off we are from the true values in the units we are predicting.

# compiling our deep learning model with the following parameters:
# mean squared error as the loss function
# mean absolute error as the metric
# Adam as the optimizer -- a widely used one
from tensorflow.keras.optimizers import Adam

opt = Adam(learning_rate=0.01)
my_model.compile(loss='mse', metrics=['mae'], optimizer=opt)
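
To make the difference between the two metrics concrete, here is a small NumPy sketch with made-up values, computing both by hand:

import numpy as np

# hypothetical true values and model predictions
y_true = np.array([3.0, 5.0, 2.0])
y_pred = np.array([2.5, 5.5, 4.0])

mse = np.mean((y_true - y_pred) ** 2)   # (0.25 + 0.25 + 4.0) / 3 = 1.5
mae = np.mean(np.abs(y_true - y_pred))  # (0.5 + 0.5 + 2.0) / 3 = 1.0
print(mse, mae)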


Fit and Evaluate Your Deep Learning Model

Once a deep learning model is compiled, it is time to fit it to the training data and evaluate it on the test data.

Using the .fit() Keras method on the training data, we specify the following parameters:
- the training set of the data
- the true labels for the training set of data
- epochs, which is the number of cycles through the full training dataset
- batch_size, which is the number of data points to work through before updating the model parameters

After we fit the model, we evaluate it using the .evaluate() Keras method on the test set of data.

# fitting our model
my_model.fit(train_data, train_labels, epochs=50, batch_size=3, verbose=1)

# evaluating our model
val_mse, val_mae = my_model.evaluate(test_data, test_labels, verbose=0)

Input, Output, and Hidden Layers

In a sequential deep learning model, we have three different types of layers:
- Input layer: a placeholder for data to enter the neural network
- Output layer: the final layer of the neural network, where results are output
- Hidden layer: an intermediate layer that adds more complexity and captures non-linear interactions among inputs and outputs in a neural network

There is always only one input layer and one output layer, while there can be as many hidden layers as desired (even zero). Together, all these layers create a neural network.

from tensorflow.keras.layers import InputLayer
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

my_model = Sequential()

# adding an input layer for a dataframe with 15 columns
my_model.add(InputLayer(input_shape=(15,)))

# hidden layer with 64 neurons and relu activation function
my_model.add(Dense(64, activation='relu'))

# adding an output layer to our model
my_model.add(Dense(1))

Hyperparameter Tuning

After training and evaluating a neural network model, one must start the process of hyperparameter tuning, which involves tweaking hyperparameter values to continuously improve results.

We use the three datasets and our hyperparameters to adjust and evaluate our model's performance (a sketch of this loop follows the list below):
- We use training data to adjust the weights and biases of our model to change its fit.
- We use validation data to evaluate the model's performance.
- If the validation performance is good, we can use our test data to check if our model still performs well with a completely new set of data.
- If the validation performance isn't good, we tweak our hyperparameters before retraining the model:
  - the learning rate
  - batch size
  - number of epochs
  - number of hidden layers
  - optimizer
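
A minimal sketch of this loop, assuming the my_model, features_train/labels_train, and features_test/labels_test objects from the earlier sections, uses Keras' validation_split argument to hold out validation data during training:

# hold out 20% of the training data as a validation set
history = my_model.fit(features_train, labels_train, epochs=50, batch_size=16, verbose=0, validation_split=0.2)

# compare training loss to validation loss for the last epoch;
# a large gap suggests the hyperparameters need tweaking
print(history.history['loss'][-1], history.history['val_loss'][-1])

# only once validation performance looks good, evaluate on the held-out test set
print(my_model.evaluate(features_test, labels_test, verbose=0))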

Common Hyperparameters

When going through the process of hyperparameter tuning, there are several common hyperparameters to adjust:
- learning rate: determines how big of a change is applied to the weights as a consequence of the error gradient calculated on a batch of training data
- batch size: determines how many training samples are seen before updating the network's parameters (weight and bias matrices)
- epochs: represents the number of complete passes through the training dataset
- layers: the number of hidden layers we decide to put in our model

Tuning these hyperparameters is key to strong model performance. Making slight changes to them can alter performance in major ways, so hyperparameter tuning is often the longest part of building a model.

While in the process of hyperparameter tuning for a deep learning model, a good rule of thumb is to start with one hidden layer containing as many neurons as there are features in the dataset.
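
As a sketch of that rule of thumb, assuming the features DataFrame and train split from the earlier sections (the layer sizes and hyperparameter values here are starting points, not prescriptions), a first model could look like this:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, Dense
from tensorflow.keras.optimizers import Adam

num_features = features.shape[1]

model = Sequential()
model.add(InputLayer(input_shape=(num_features,)))
# one hidden layer with as many neurons as there are features
model.add(Dense(num_features, activation='relu'))
model.add(Dense(1))

# initial hyperparameters to tune later: learning rate, batch size, epochs
model.compile(loss='mse', metrics=['mae'], optimizer=Adam(learning_rate=0.01))
model.fit(features_train, labels_train, epochs=50, batch_size=16, verbose=0)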


Early Stopping

To avoid overfitting in a deep learning model, one can specify early stopping in TensorFlow with Keras by creating an EarlyStopping callback and adding it as a parameter when we fit our model. In the implementation shown here:
- monitor='val_loss' means we are monitoring the validation loss to decide when to stop the training
- mode='min' means we seek minimal loss
- patience=40 means that if the learning reaches a plateau, it will continue for 40 more epochs in case the plateau leads to improved performance

from tensorflow.keras.callbacks import EarlyStopping

stop = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=40)

history = model.fit(features_train, labels_train, epochs=num_epochs, batch_size=16, verbose=0, validation_split=0.2, callbacks=[stop])

Grid Search for Deep Learning

When tuning a deep learning model, one can use grid search, also called exhaustive search, to try every combination of desired hyperparameter values.

If, for example, we want to try learning rates of 0.01 and 0.001 and batch sizes of 10, 30, and 50, grid search will try six combinations of parameters (0.01 and 10, 0.01 and 30, 0.01 and 50, 0.001 and 10, and so on).

To implement this in Python, we use GridSearchCV from scikit-learn. For regression, we need to first wrap our neural network model into a KerasRegressor . Then, we need to set up the desired hyperparameter grid (we don't use many values for the sake of speed). Finally, we initialize a GridSearchCV object and fit our model to the data. The implementation of this is shown in the code snippet.

from sklearn.model_selection import GridSearchCV
from sklearn.metrics import make_scorer, mean_squared_error
# KerasRegressor import for older TensorFlow versions; newer code uses scikeras.wrappers instead
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor

model = KerasRegressor(build_fn=design_model)

# batch sizes and epochs to test
batch_size = [4, 8, 16, 64]
epochs = [10, 50, 100, 200]

# setting up our grid of parameters
param_grid = dict(batch_size=batch_size, epochs=epochs)

# initializing a grid search
grid = GridSearchCV(estimator=model, param_grid=param_grid, scoring=make_scorer(mean_squared_error, greater_is_better=False))

# fitting the results
grid_result = grid.fit(features_train, labels_train, verbose=0)


Random Search for Deep Learning

When tuning a deep learning model, one can use random search to go through random combinations of hyperparameters over a specific interval.

Randomized search will sample values for batch_size and nb_epoch from uniform distributions on specified intervals. For example, in the code snippet shown, we sample random batch sizes in the interval [2, 16] and random epoch counts in the interval [10, 100] for a fixed number of iterations, in our case 12:

from scipy.stats import randint as sp_randint
from sklearn.model_selection import RandomizedSearchCV
from sklearn.metrics import make_scorer, mean_squared_error

# parameter grid with batch sizes between 2 and 16, and epochs between 10 and 100
param_grid = {'batch_size': sp_randint(2, 16), 'nb_epoch': sp_randint(10, 100)}

# initializing random search with 12 iterations;
# scoring uses mse as the metric and looks for lower scores
grid = RandomizedSearchCV(estimator=model, param_distributions=param_grid, scoring=make_scorer(mean_squared_error, greater_is_better=False), n_iter=12)

Regularization and Dropout

Regularization is a set of techniques that help avoid overfitting by preventing the learning process from fitting a deep learning model completely.

Dropout is a regularization technique that randomly ignores, or "drops out", a number of outputs of a layer by setting them to zero. The dropout rate is the percentage of layer outputs set to zero (usually between 20% and 50%). In Keras, we can add a dropout layer by introducing the Dropout layer.

# A model with two dropout layers
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers

# setting up model and input layer
model = Sequential()
my_input = tf.keras.Input(shape=(20,))
model.add(my_input)
model.add(layers.Dense(128, activation='relu'))

# dropout layer with dropout rate of 0.1
model.add(layers.Dropout(0.1))

model.add(layers.Dense(64, activation='relu'))

# dropout layer with dropout rate of 0.2
model.add(layers.Dropout(0.2))

model.add(layers.Dense(24, activation='relu'))
