Artificial Intelligence - Chapter 7
➢ Nonlinear activation functions allow the network to combine the inputs in more complex
ways and in turn provide a richer capability in the functions it can model
○ The logistic function, also called the sigmoid function, outputs a value between 0 and 1 with an
S-shaped curve
○ The hyperbolic tangent function, also called Tanh, outputs the same S-shaped curve over the range
-1 to +1
○ More recently, the rectifier activation function (ReLU) has been shown to provide better results
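The three activation functions above can be sketched with NumPy; the sample inputs are illustrative only:

```python
import numpy as np

def sigmoid(x):
    # Logistic function: S-shaped output in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Hyperbolic tangent: S-shaped output in (-1, 1)
    return np.tanh(x)

def relu(x):
    # Rectifier: zero for negative inputs, identity for positive inputs
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # sigmoid(0) = 0.5, the midpoint of (0, 1)
print(tanh(x))     # tanh(0) = 0.0, the midpoint of (-1, 1)
print(relu(x))     # negative inputs are clipped to 0
```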
Networks of Neurons
➢ Neurons are arranged into networks. A row of neurons is called a layer,
and one network can have multiple layers. The architecture of the neurons
in the network is often called the network topology
○ Input or Visible Layer: the bottom layer that takes input from your dataset
○ This is where one row of data is exposed to the network at a time as input. The network processes the input
upward, activating neurons as it goes, to finally produce an output value. This is called a forward pass through
the network
○ The output of the network is compared to the expected output and an error is calculated
○ This error is then propagated back through the network, one layer at a time, and the weights are updated
according to the amount they contributed to the error; this is the Backpropagation algorithm
○ The process is repeated for all of the examples in your training data. One round of updating the network for
the entire training dataset is called an epoch
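The forward pass described above can be sketched with NumPy; the layer sizes and random weights here are illustrative, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One row of data with 8 input features, exposed to the network at a time
x = rng.normal(size=(8,))

# Hidden layer: 12 neurons, each with one weight per input plus a bias
W1, b1 = rng.normal(size=(8, 12)), np.zeros(12)
# Output layer: a single neuron producing the final output value
W2, b2 = rng.normal(size=(12, 1)), np.zeros(1)

hidden = sigmoid(x @ W1 + b1)       # activate the hidden-layer neurons
output = sigmoid(hidden @ W2 + b2)  # final output value in (0, 1)
print(output)
```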
Training Networks
➢ Weight Updates
○ The weights in the network can be updated from the errors calculated for each training example; this is
called online learning
○ Alternatively, the errors can be accumulated across all of the training examples and the network updated
at the end. This is called batch learning and is often more stable
○ The amount that the weights are updated is controlled by a configuration parameter called the learning rate
○ The learning rate controls the step, or change, made to the network weights for a given error. Small learning
rates are often used, such as 0.1, 0.01, or smaller
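The weight-update step controlled by the learning rate can be sketched as plain gradient descent on a single weight; the toy loss function here is an assumption for illustration:

```python
# Gradient descent on a toy loss L(w) = (w - 3)^2, with gradient dL/dw = 2*(w - 3)
learning_rate = 0.1  # a typical small value, as noted above
w = 0.0
for _ in range(100):
    grad = 2.0 * (w - 3.0)
    # Each update moves the weight by a step proportional to the learning rate
    w = w - learning_rate * grad
print(w)  # approaches the minimum of the loss at w = 3
```

A larger learning rate takes bigger steps and may overshoot; a smaller one converges more slowly but more stably.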
➢ Prediction
○ Once a neural network has been trained, it can be used to make predictions
○ You can make predictions on test or validation data in order to estimate the skill of the model on unseen data
○ You can also deploy it operationally and use it to make predictions continuously
Classification MLPs
➢ For a binary classification problem, you just need a single output neuron using
the logistic activation function: the output will be a number between 0 and 1,
which you can interpret as the estimated probability of the positive class.
➢ If each instance can belong only to a single class, out of 3 or more possible
classes (e.g., classes 0 through 9 for digit image classification), then you need
to have one output neuron per class, and you should use the softmax
activation function for the whole output layer; this is called multiclass
classification
➢ Regarding the loss function, since we are predicting probability distributions,
the cross-entropy (also called the log loss) is generally a good choice
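The softmax output layer and cross-entropy loss can be sketched in NumPy (Keras provides both built in; this only shows the underlying math, with made-up logits):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; outputs are positive and sum to 1
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(p, y):
    # Log loss for a true class index y: low when p[y] is close to 1
    return -np.log(p[y])

logits = np.array([2.0, 1.0, 0.1])  # raw outputs of 3 output neurons
probs = softmax(logits)
print(probs, probs.sum())        # a probability distribution over the classes
print(cross_entropy(probs, 0))   # small loss: class 0 is the most probable
print(cross_entropy(probs, 2))   # larger loss: class 2 is the least probable
```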
Deep Learning Frameworks
➢ Choosing a deep learning framework is no easy task, but we will stick with Keras for our deep
learning tasks. The Python ecosystem for deep learning is certainly thriving now, for example:
○ TensorFlow (https://www.tensorflow.org/): a neural network library released by Google,
and also the framework used by their artificial intelligence team, Google Brain
○ Theano (http://deeplearning.net/software/theano/): arguably one of the first thorough deep learning
frameworks, built at MILA, the lab of Yoshua Bengio, one of the pioneers of deep learning
○ Caffe (http://caffe.berkeleyvision.org/) & Caffe2 (https://caffe2.ai/): Caffe is one of the first
dedicated deep learning frameworks, developed at UC Berkeley
○ PyTorch (https://pytorch.org/): the new kid on the block, but a library that is growing rapidly; the
Facebook Artificial Intelligence Research team (FAIR) has endorsed PyTorch
○ Keras (https://keras.io/): with its high level of abstraction and clean API, it remains the best deep
learning framework for prototyping, and it can use either Theano or TensorFlow as the backend for
constructing the networks. It is very easy to go from idea to execution
Your First Deep Learning Project in Python with Keras Step-By-Step
➢ We will use the NumPy library to load our dataset, and we will use
two classes from the Keras library to define our model
➢ We can now load the Pima Indians onset of diabetes dataset
○ It describes patient medical record data for Pima Indians and
whether they had an onset of diabetes within five years
○ All of the input variables that describe each patient are numerical.
This makes the data easy to use directly with neural networks that expect
numerical input and output values: y = f(X), mapping rows of input
variables (X) to an output variable (y)
○ We load the dataset and split the array into two arrays: the first
8 columns as inputs (X) and the 9th variable as the output (y)
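The load-and-split step can be sketched as follows; a synthetic array stands in for the real data file here, and the CSV file name in the comment is an assumption:

```python
import numpy as np

# In the real project this would be something like:
#   dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
# Here a synthetic array with 9 columns stands in for the loaded data.
rng = np.random.default_rng(0)
dataset = rng.random((10, 9))

X = dataset[:, 0:8]  # first 8 columns: input variables
y = dataset[:, 8]    # 9th column: output variable
print(X.shape, y.shape)  # (10, 8) (10,)
```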
➢ We train the model so that it learns a good mapping of rows of input (X) to
the output (y)
○ These configurations can be chosen experimentally, by trial and error
○ The model will always have some error, but the amount of error will level out
after some point for a given model configuration; this is called model convergence
➢ We must specify the loss function used to evaluate a set of weights, and the
optimizer used to search through different weights for the network
○ Since it is a classification problem, we will also collect and report the
classification accuracy during training
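What the loss function and the accuracy metric actually compute can be sketched in NumPy (Keras computes both internally once the model is compiled; the sample predictions here are made up):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred):
    # Average log loss over all examples; this is what the optimizer minimizes
    eps = 1e-12  # clip to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

def accuracy(y_true, y_pred):
    # Fraction of predictions falling on the correct side of the 0.5 threshold
    return np.mean((y_pred > 0.5) == (y_true == 1))

y_true = np.array([1, 0, 1, 1])
y_pred = np.array([0.9, 0.2, 0.6, 0.4])  # made-up network outputs
print(binary_cross_entropy(y_true, y_pred))
print(accuracy(y_true, y_pred))  # 0.75: three of the four are correct
```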
➢ The evaluate() function will return a list with two values: the loss and the accuracy
○ We are only interested in reporting the accuracy, so we will ignore the loss value