Chapter 10: Keras
Tsz-Chiu Au
[email protected]
» The pixel intensities are represented as integers (from 0 to 255) rather than floats (from
0.0 to 255.0)
• Since we are going to train the neural network using Gradient Descent, we
must scale the input features down to the 0–1 range by dividing them by
255.0:
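A minimal sketch of the loading and scaling step, assuming the Fashion MNIST split from keras.datasets (the variable names and the 5,000-image validation split are illustrative):

    import tensorflow as tf
    from tensorflow import keras

    # Pixel values come in as integers in the 0-255 range.
    (X_train_full, y_train_full), (X_test, y_test) = \
        keras.datasets.fashion_mnist.load_data()

    # Hold out a validation set and scale everything to the 0-1 range.
    X_valid, X_train = X_train_full[:5000] / 255.0, X_train_full[5000:] / 255.0
    y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
    X_test = X_test / 255.0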
Naming the Labels
• Unlike MNIST, Fashion MNIST needs a list of class names, one per label, to know what each image represents:
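The standard Fashion MNIST class names, indexed by label:

    class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
                   "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
    class_names[y_train[0]]  # the name of the first training image's class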
• The first method for building a neural network in tf.keras is to use the Sequential API.
» It works only for neural networks composed of a single stack of layers connected sequentially.
• The tf.keras code for building a classification MLP with two hidden layers:
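A sketch of the classification MLP from the book’s example:

    model = keras.models.Sequential([
        keras.layers.Flatten(input_shape=[28, 28]),   # 28x28 image -> 1D array of 784 values
        keras.layers.Dense(300, activation="relu"),   # first hidden layer
        keras.layers.Dense(100, activation="relu"),   # second hidden layer
        keras.layers.Dense(10, activation="softmax"), # one output neuron per class
    ])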
• The model’s summary() method displays all the model’s layers, with each layer’s output shape and number of parameters:
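For the model built above:

    model.summary()  # prints each layer's name, output shape, and parameter count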
Accessing the Information of a Model
• Directly get a model’s list of layers:
• All the parameters of a layer can be accessed using its get_weights() and set_weights() methods:
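For example, for the model above (hidden1 is an illustrative name):

    hidden1 = model.layers[1]                # the first hidden layer
    weights, biases = hidden1.get_weights()
    weights.shape                            # (784, 300): one weight per input per neuron
    biases.shape                             # (300,): one bias per neuron
    hidden1.set_weights([weights, biases])   # write the parameters back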
Compiling the Model
• Before training the model, you must compile the model:
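The compile step from the book’s example; sparse categorical cross-entropy fits integer class labels like Fashion MNIST’s:

    model.compile(loss="sparse_categorical_crossentropy",
                  optimizer="sgd",
                  metrics=["accuracy"])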
• You should check whether overfitting occurs (i.e., accuracy >> val_accuracy)
• Consider passing the class_weight argument to fit() if the training set is skewed, as in the sketch below.
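A training call matching the setup above; the class_weight mapping shown in the comment is a hypothetical illustration:

    history = model.fit(X_train, y_train, epochs=30,
                        validation_data=(X_valid, y_valid))

    # For a skewed training set, weight the rare classes more heavily, e.g.:
    # model.fit(X_train, y_train, epochs=30,
    #           validation_data=(X_valid, y_valid),
    #           class_weight={0: 1.0, 1: 5.0})  # hypothetical weights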
Drawing the Learning Curves
• fit() returns a History object, which contains:
» The training parameters (history.params)
» The list of epochs it went through (history.epoch)
» The loss and extra metrics at the end of each epoch on the training set
and on the validation set (history.history).
• You can draw the learning curves using matplotlib:
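The book’s plotting snippet, which charts everything recorded in history.history:

    import pandas as pd
    import matplotlib.pyplot as plt

    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True)
    plt.gca().set_ylim(0, 1)  # losses and accuracies both fit in [0, 1] here
    plt.show()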
Drawing the Learning Curves (cont.)
• The learning curve shows the mean training loss and accuracy
measured over each epoch, and the mean validation loss and
accuracy measured at the end of each epoch:
• When reading the learning curves, you should shift the training curve by half an epoch to the left: the training metrics are averaged over each epoch, while the validation metrics are computed at the end of each epoch.
Continuing the Training
• If the model has not converged yet, call fit() again to continue the
training.
• If you are not satisfied with the performance of your model, you
should go back and tune the hyperparameters.
» Tune the learning rate
» Try another optimizer
» Adjust the number of layers, the number of neurons per layer, and the
types of activation functions to use for each hidden layer
» Change the batch size
• Finally, estimate the generalization error on the test set before you deploy the model to production:
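Using the held-out test set from the earlier sketch:

    model.evaluate(X_test, y_test)  # returns the loss and metrics on the test set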
• If you want to know only the class with the highest estimated probability, the book uses the predict_classes() method; that method was removed in recent TensorFlow versions, so take the argmax of predict() instead:
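A sketch using names from the earlier snippets; X_new is an illustrative batch of new images:

    import numpy as np

    X_new = X_test[:3]
    y_proba = model.predict(X_new)        # per-class probabilities
    y_pred = np.argmax(y_proba, axis=-1)  # index of the most likely class
    np.array(class_names)[y_pred]         # map label indices to names
    # On older TF versions: y_pred = model.predict_classes(X_new)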
• If you need extra control, you can easily write your own custom callbacks. For example:
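A sketch of a custom callback in the spirit of the book’s example; the class name is illustrative:

    class PrintValTrainRatioCallback(keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs):
            # A val/train loss ratio well above 1 suggests overfitting.
            print("\nval/train: {:.2f}".format(logs["val_loss"] / logs["loss"]))

    history = model.fit(X_train, y_train, epochs=10,
                        validation_data=(X_valid, y_valid),
                        callbacks=[PrintValTrainRatioCallback()])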
Using TensorBoard for Visualization
• TensorBoard is a great interactive visualization
tool that you can use to
» view the learning curves during training
» compare learning curves between multiple runs
» visualize the computation graph
» analyze training statistics
» view images generated by your model
» visualize complex multidimensional data projected
down to 3D and automatically clustered for you
» etc.
Visualizing Learning Curves with TensorBoard
Using TensorBoard
• To use TensorBoard, you must modify your program so that it outputs the
data you want to visualize to special binary log files called event files.
• Each binary data record is called a summary.
• The TensorBoard server will monitor the log directory, and it will
automatically pick up the changes and update the visualizations.
• In general, you want to point the TensorBoard server to a root log
directory and configure your program so that it writes to a different
subdirectory every time it runs.
• Define the root log directory for TensorBoard logs:
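A sketch following the book’s convention of one timestamped subdirectory per run:

    import os
    import time

    root_logdir = os.path.join(os.curdir, "my_logs")

    def get_run_logdir():
        # e.g., './my_logs/run_2024_01_31-12_00_00'
        run_id = time.strftime("run_%Y_%m_%d-%H_%M_%S")
        return os.path.join(root_logdir, run_id)

    run_logdir = get_run_logdir()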
Using TensorBoard (cont.)
• Keras provides the TensorBoard() callback:
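A minimal use of the callback with the run directory defined above:

    tensorboard_cb = keras.callbacks.TensorBoard(run_logdir)
    history = model.fit(X_train, y_train, epochs=30,
                        validation_data=(X_valid, y_valid),
                        callbacks=[tensorboard_cb])

    # Then start the server from a terminal:
    #   tensorboard --logdir=./my_logs --port=6006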
• Once the server is up, you can open a web browser and go to
http://localhost:6006
• To use TensorBoard directly within Jupyter:
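Two Jupyter magic commands load the extension and embed the TensorBoard interface in the notebook:

    %load_ext tensorboard
    %tensorboard --logdir=./my_logs --port=6006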
Using TensorBoard (cont.)
• TensorFlow offers a lower-level API in the tf.summary package.
» E.g., you can create a SummaryWriter using the create_file_writer() function and use it as a context to log scalars, histograms, images, audio, and text, all of which can then be visualized using TensorBoard:
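A sketch following the book’s example; the summary name "my_scalar" and the log directory are illustrative:

    import numpy as np
    import tensorflow as tf

    writer = tf.summary.create_file_writer("./my_logs/run_extra")
    with writer.as_default():
        for step in range(1, 1001):
            tf.summary.scalar("my_scalar", np.sin(step / 10), step=step)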
What we’ve learned so far
• The history of neural network research
• What an MLP is and how to use it for classification and regression
• How to use tf.keras’s Sequential API to build MLPs
• How to use the Functional API or the Subclassing API to build more complex model architectures
• How to save and restore a model
• How to use callbacks for checkpointing, early stopping, and more
• How to use TensorBoard for visualization
Fine-Tuning Neural Network Hyperparameters
• There are many hyperparameters to tweak:
» E.g., the number of layers, the number of neurons per layer, the type
of activation function to use in each layer, the weight initialization
logic
• How do you know what combination of hyperparameters is
the best?
• One option is to try many combinations of hyperparameters and see which one works best on the validation set (or use K-fold cross-validation).
» E.g., use GridSearchCV or RandomizedSearchCV to explore the hyperparameter space, as in the sketch below.
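A sketch of a randomized search, assuming the scikeras package (the successor to the removed keras.wrappers.scikit_learn module); all names, ranges, and counts here are illustrative:

    import numpy as np
    from scikeras.wrappers import KerasClassifier
    from sklearn.model_selection import RandomizedSearchCV
    from tensorflow import keras

    def build_model(n_hidden=1, n_neurons=30, learning_rate=3e-3):
        # Build and compile an MLP whose shape is set by the hyperparameters.
        model = keras.models.Sequential([keras.layers.Flatten(input_shape=[28, 28])])
        for _ in range(n_hidden):
            model.add(keras.layers.Dense(n_neurons, activation="relu"))
        model.add(keras.layers.Dense(10, activation="softmax"))
        model.compile(loss="sparse_categorical_crossentropy",
                      optimizer=keras.optimizers.SGD(learning_rate=learning_rate),
                      metrics=["accuracy"])
        return model

    clf = KerasClassifier(model=build_model, epochs=10, verbose=0)
    param_distribs = {
        "model__n_hidden": [1, 2, 3],
        "model__n_neurons": np.arange(50, 300),
        "model__learning_rate": [1e-3, 3e-3, 1e-2],
    }
    search = RandomizedSearchCV(clf, param_distribs, n_iter=10, cv=3)
    search.fit(X_train, y_train)
    search.best_params_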
Fine-Tuning Hyperparameters (cont.)