
Artificial Intelligence

Chapter 7: Deep Learning


Artificial Neural Networks
➢ In the past few years, research has quickly blossomed around neural networks. With widely available open
source tools, the power of neural networks to find patterns in large datasets quickly transformed the NLP
landscape.
➢ The nature of words and their secrets is most tightly correlated with their relations to each other, which can be expressed in at least two ways:
○ Word order (spatially): you examine the statement as if written on the page, looking for relationships in the positions of words
○ Word proximity (temporally): you explore it as if spoken, so the words and letters become time series data
➢ Basic feedforward networks (multilayer perceptrons) are capable of pulling patterns out of data, but they do not capture the spatial or temporal relations of the tokens.
➢ Feedforward networks are only the beginning of the neural network architectures out there.
➢ The two most important choices for natural language processing are currently convolutional neural networks and recurrent neural networks.
The Perceptron
➢ The Perceptron is one of the simplest ANN architectures, invented in 1957 by Frank Rosenblatt. A
perceptron is a single neuron model.
➢ It is based on a slightly different artificial neuron called a threshold logic unit (TLU), or sometimes a linear threshold unit (LTU): the inputs and output are now numbers (instead of binary on/off values) and each input connection is associated with a weight.

➢ The TLU computes a weighted sum of its inputs, z = w₁x₁ + w₂x₂ + ⋯ + wₙxₙ = xᵀw, then applies a step function to that sum and outputs the result: h_w(x) = step(z), where z = xᵀw.
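➢ A minimal NumPy sketch of a TLU; the weights, inputs, and function names below are illustrative, not from the slides:

    import numpy as np

    def step(z):
        # Heaviside step: 1 if z >= 0, else 0
        return (z >= 0).astype(int)

    def tlu_output(x, w):
        # z = w1*x1 + ... + wn*xn = x^T w
        z = x @ w
        return step(z)

    # Example: 2 inputs with hand-picked weights (illustrative values)
    x = np.array([1.0, 2.0])
    w = np.array([0.5, -0.2])
    print(tlu_output(x, w))  # z = 0.5*1 + (-0.2)*2 = 0.1 -> outputs 1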
Multi-Layer Perceptron (MLP)
➢ An MLP is composed of one (passthrough) input layer, one or more layers of TLUs called hidden layers, and one final layer of TLUs called the output layer
○ The layers close to the input layer are usually called the lower layers, and the ones close to the outputs are usually called the upper layers
○ When an ANN contains a deep stack of hidden layers, it is called a deep neural network (DNN)
○ Every layer except the output layer includes a bias neuron and is fully connected to the next layer
Multi-Layer Perceptron and Backpropagation
➢ For many years researchers struggled to find a way to train MLPs, without
success
➢ But in 1986, David Rumelhart, Geoffrey Hinton and Ronald Williams published
a groundbreaking paper introducing the backpropagation training algorithm,
which is still used today
➢ In just two passes through the network (one forward, one backward), the backpropagation algorithm is able to compute the gradient of the network's error with regard to every single model parameter
➢ In other words, it can find out how each connection weight and each bias term
should be tweaked in order to reduce the error.
Multi-Layer Perceptron and Backpropagation
➢ Let’s run through this algorithm in a bit more detail
○ It handles one mini-batch at a time (for example containing 32 instances each), and it goes
through the full training set multiple times. Each pass is called an epoch
○ Each mini-batch is passed to the network’s input layer, which just sends it to the first hidden
layer. The algorithm then computes the output of all the neurons in this layer (for every
instance in the mini-batch). The result is passed on to the next layer until we get the output of
the last layer, the output layer. This is the forward pass:
○ Next, the algorithm measures the network's output error using a loss function
○ Then it computes how much each output connection contributed to the error
○ In conclusion, for each training instance the backpropagation algorithm first makes a prediction
(forward pass), measures the error, then goes through each layer in reverse to measure the
error contribution from each connection (reverse pass), and finally slightly tweaks the
connection weights to reduce the error (Gradient Descent step)
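➢ A compact NumPy sketch of one backpropagation step for a tiny one-hidden-layer network with sigmoid activations and a squared-error loss; the layer sizes and data are made up for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Toy mini-batch: 32 instances, 4 features, binary targets (made-up data)
    X = rng.normal(size=(32, 4))
    y = rng.integers(0, 2, size=(32, 1)).astype(float)

    # One hidden layer of 5 units, one output unit (arbitrary sizes)
    W1, b1 = rng.normal(size=(4, 5)), np.zeros(5)
    W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)
    lr = 0.1

    # Forward pass: layer by layer, for every instance in the mini-batch
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    loss = np.mean((out - y) ** 2)   # measure the error with a loss function

    # Reverse pass: the chain rule gives each connection's error contribution
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    dW2, db2 = h.T @ d_out, d_out.sum(axis=0)
    d_h = d_out @ W2.T * h * (1 - h)
    dW1, db1 = X.T @ d_h, d_h.sum(axis=0)

    # Gradient Descent step: tweak the weights slightly to reduce the error
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= lr * g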
Activation
➢ The weighted inputs are summed and passed through an activation function, sometimes called a transfer function
○ An activation function is a simple mapping of the summed weighted input to the output of the neuron
○ Historically, simple step activation functions were used: if the summed input was above a threshold, for example 0.5, the neuron would output a value of 1.0, otherwise it would output 0.0 (in contrast to a linear activation function, which simply passes the weighted sum through)

➢ Nonlinear activation functions allow the network to combine the inputs in more complex ways and in turn provide a richer capability in the functions they can model
○ Nonlinear functions like the logistic function, also called the sigmoid function, output a value between 0 and 1 with an S-shaped curve
○ The hyperbolic tangent function, also called tanh, outputs the same S-shaped distribution over the range -1 to +1
○ More recently, the rectifier activation function (ReLU) has been shown to provide better results
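➢ These functions are easy to write down directly; a short NumPy sketch (the 0.5 threshold matches the example above):

    import numpy as np

    def step(z):      # historical threshold activation
        return np.where(z >= 0.5, 1.0, 0.0)

    def sigmoid(z):   # logistic: S-shaped output in (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    def tanh(z):      # hyperbolic tangent: S-shaped output in (-1, 1)
        return np.tanh(z)

    def relu(z):      # rectifier: 0 for negative input, identity otherwise
        return np.maximum(0.0, z)

    z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    for f in (step, sigmoid, tanh, relu):
        print(f.__name__, f(z))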
Networks of Neurons
➢ Neurons are arranged into networks of neurons. A row of neurons is called a layer and one network can have multiple layers. The architecture of the neurons in the network is often called the network topology
○ Input or Visible Layer: the bottom layer that takes input from your dataset
○ Hidden Layers: layers after the input layer are called hidden layers because they are not directly exposed to the input. Deep learning can refer to having many hidden layers in your neural network
○ Output Layer: the final layer is called the output layer and it is responsible for outputting a value or vector of values
Training Networks
➢ Data Preparation
○ Data must be numerical, for example real values. If you have categorical data, such as a sex attribute with the values male and female, you can convert it to a real-valued representation called a one hot encoding (see the sketch below)
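➢ A minimal sketch of one hot encoding the sex attribute from the example; the column order is an arbitrary choice:

    import numpy as np

    categories = ["male", "female"]        # the categorical values
    data = ["male", "female", "female"]    # example column

    # One column per category: 1 marks the category, 0 everywhere else
    one_hot = np.array([[1.0 if v == c else 0.0 for c in categories]
                        for v in data])
    print(one_hot)   # [[1. 0.], [0. 1.], [0. 1.]]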

➢ Stochastic Gradient Descent


○ The classical and still preferred training algorithm for neural networks is called stochastic gradient descent

○ This is where one row of data is exposed to the network at a time as input. The network processes the input upward, activating neurons as it goes, to finally produce an output value. This is called a forward pass on the network

○ The output of the network is compared to the expected output and an error is calculated

○ This error is then propagated back through the network, one layer at a time, and the weights are updated according to the amount that they contributed to the error; this is the backpropagation algorithm

○ The process is repeated for all of the examples in your training data. One round of updating the network for
the entire training dataset is called an epoch
Training Networks
➢ Weight Updates
○ The weights in the network can be updated from the errors calculated for each training example and this is
called online learning

○ Alternatively, the errors can be saved up across all of the training examples and the network can be updated
at the end. This is called batch learning and is often more stable.

○ The amount that weights are updated is controlled by a configuration parameter called the learning rate

○ The learning rate controls the step or change made to the network weights for a given error; small learning rates are often used, such as 0.1 or 0.01 or smaller (see the sketch below)
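➢ The update rule itself is a single line; a toy sketch with made-up numbers:

    learning_rate = 0.01   # small values such as 0.1 or 0.01 are typical
    weight = 0.8           # illustrative current weight
    error_gradient = 2.5   # illustrative gradient of the error w.r.t. this weight

    # The learning rate scales the step taken against the gradient
    weight = weight - learning_rate * error_gradient
    print(weight)          # 0.775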

➢ Prediction
○ Once a neural network has been trained it can be used to make predictions.

○ You can make predictions on test or validation data in order to estimate the skill of the model on unseen data

○ You can also deploy it operationally and use it to make predictions continuously
Classification MLPs
➢ For a binary classification problem, you just need a single output neuron using
the logistic activation function: the output will be a number between 0 and 1,
which you can interpret as the estimated probability of the positive class.
➢ If each instance can belong only to a single class, out of 3 or more possible classes (e.g., classes 0 through 9 for digit image classification), then you need one output neuron per class, and you should use the softmax activation function for the whole output layer; this is called multiclass classification
➢ Regarding the loss function, since we are predicting probability distributions,
the cross-entropy (also called the log loss) is generally a good choice
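➢ A small NumPy sketch of softmax and cross-entropy for a 3-class problem; the scores and true label below are made up:

    import numpy as np

    def softmax(z):
        # Subtract the max for numerical stability
        e = np.exp(z - z.max())
        return e / e.sum()

    # Illustrative output-layer scores for a 3-class problem
    z = np.array([2.0, 1.0, 0.1])
    p = softmax(z)                   # estimated class probabilities, sums to 1
    print(p)

    true_class = 0                   # assume class 0 is the correct label
    cross_entropy = -np.log(p[true_class])   # log loss for this instance
    print(cross_entropy)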
Deep Learning Frameworks
➢ Choosing a deep learning framework is no easy task, but we will stick with Keras for our deep learning tasks. The Python ecosystem for deep learning is certainly thriving now, for example:
○ TensorFlow (https://www.tensorflow.org/): a neural network library released by Google, and also the framework used by their artificial intelligence team, Google Brain
○ Theano (http://deeplearning.net/software/theano/): arguably one of the first thorough deep learning frameworks, built at MILA by Yoshua Bengio, one of the pioneers of deep learning
○ Caffe (http://caffe.berkeleyvision.org/) & Caffe2 (https://caffe2.ai/): Caffe is one of the first dedicated deep learning frameworks, developed at UC Berkeley
○ PyTorch (https://pytorch.org/): the new kid on the block, but a library which is growing rapidly; the Facebook Artificial Intelligence Research team (FAIR) has endorsed PyTorch
○ Keras (https://keras.io/): with its high level of abstraction and clean API, it remains the best deep learning framework for prototyping and can use either Theano or TensorFlow as the backend for constructing the networks. It is very easy to go from idea to execution.
Your First Deep Learning Project in Python with Keras Step-By-Step

➢ Load Data
○ The first step is to define the functions and classes we intend to use; the required imports are listed first
○ We will use the NumPy library to load our dataset, and we will use two classes from the Keras library to define our model
○ We then load the dataset and split the array into two arrays: the 8 input columns and the 9th variable
➢ We can now load the Pima Indians onset of diabetes dataset
○ It describes patient medical record data for Pima Indians and whether they had an onset of diabetes within five years
○ It is a binary classification problem (onset of diabetes as 1, or not as 0)
○ All of the input variables that describe each patient are numerical. This makes the dataset easy to use directly with neural networks, which expect numerical input and output values: we are learning a mapping y = f(X) from rows of input variables (X) to an output variable (y)
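➢ A sketch of the load step, assuming the dataset has been saved locally as pima-indians-diabetes.csv (the filename is an assumption):

    from numpy import loadtxt

    # Load the dataset (assumes the CSV file is in the working directory)
    dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')

    # Split into input (X) and output (y) variables:
    # the first 8 columns describe the patient, the 9th is the label
    X = dataset[:, 0:8]
    y = dataset[:, 8]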
Your First Deep Learning Project in Python with Keras Step-By-Step

➢ Define the Keras model
○ Models in Keras are defined as a sequence of layers
○ We create a Sequential() model and add layers one at a time until we are happy with our network architecture
○ The first thing to get right is to ensure the input layer has the right number of input features
➢ How do we know the number of layers and their types?
○ The best network structure is found through a process of trial and error experimentation
○ Generally, you need a network large enough to capture the structure of the problem
○ In this example, we will use a fully-connected network structure with three layers
○ Fully connected layers are defined using the Dense() layer
○ We can specify the number of neurons or nodes in the layer as the first argument, and the activation function as well
○ The model expects rows of data with 8 variables (the input_dim=8 argument)
○ The first hidden layer has 12 nodes and uses the relu activation function
○ The output layer has one node and uses the sigmoid activation function
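➢ A sketch of the model definition; the slides specify the first hidden layer (12 nodes, relu) and the output layer (1 node, sigmoid), so the middle 8-node hidden layer here is our assumption to complete the three-layer structure:

    from keras.models import Sequential
    from keras.layers import Dense

    # Define the Keras model as a sequence of layers
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))  # first hidden layer: 12 nodes, relu
    model.add(Dense(8, activation='relu'))                # second hidden layer (assumed size: 8)
    model.add(Dense(1, activation='sigmoid'))             # output layer: 1 node, sigmoid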
Your First Deep Learning Project in Python with Keras Step-By-Step

➢ Fit the Keras model
○ We can train or fit our model on our loaded data by calling the fit() function on the model
○ Training occurs over epochs, and each epoch is split into batches
○ One epoch is comprised of one or more batches, based on the chosen batch size, and the model is fit for many epochs
○ The training process will run for a fixed number of iterations through the dataset, called epochs, that we must specify using the epochs argument
○ We must also set the number of dataset rows that are considered before the model weights are updated within each epoch, called the batch size and set using the batch_size argument

○ We train the model so that it learns a good mapping of rows of input X to the output y
○ These configurations can be chosen experimentally by trial and error
○ The model will always have some error, but the amount of error will level out after some point for a given model configuration; this is model convergence
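➢ A minimal sketch of the fit call; 150 epochs and a batch size of 10 are illustrative values chosen by trial and error, not prescribed by the slides, and note that in Keras the model must be compiled (the next step) before fit() will run:

    # Train the model: epochs and batch_size are experimental choices
    model.fit(X, y, epochs=150, batch_size=10)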
Your First Deep Learning Project in Python with Keras Step-By-Step

➢ Compile the Keras model
○ Compiling the model uses the efficient numerical libraries under the covers (the so-called backend) such as Theano or TensorFlow
○ When compiling, we must specify some additional properties required when training the network
○ Remember, training a network means finding the best set of weights to map inputs to outputs in our dataset
○ We must specify the loss function used to evaluate a set of weights; the optimizer is used to search through different weights for the network

○ Since this is a classification problem, we will collect and report the classification accuracy during training
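➢ A sketch of the compile step; binary cross-entropy matches the single sigmoid output, and 'adam' is an assumed optimizer choice since the slides do not name one:

    # Compile the model: binary cross-entropy suits this two-class problem;
    # 'adam' is a common optimizer choice (an assumption, not named in the slides)
    model.compile(loss='binary_crossentropy', optimizer='adam',
                  metrics=['accuracy'])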
Your First Deep Learning Project in Python with Keras Step-By-Step

➢ Evaluate the Keras model
○ We have trained our neural network on the entire dataset, and we can evaluate the performance of the network on the same dataset
○ This will only give us an idea of how well we have modeled the dataset (e.g. train accuracy), but no idea of how well the algorithm might perform on new data
○ It is better to separate your data into train and test datasets for training and evaluation of your model
○ You can evaluate your model on your training dataset using the evaluate() function on your model and pass it the same input and output used to train the model
○ This will generate a prediction for each input and output pair and collect scores, including the average loss and any metrics you have configured, such as accuracy

○ The evaluate() function will return a list with two values
○ We are only interested in reporting the accuracy, so we will ignore the loss value
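➢ A sketch of the evaluate step, reusing the X and y arrays loaded earlier:

    # Evaluate on the same data used for training (gives train accuracy only)
    loss, accuracy = model.evaluate(X, y)
    print('Accuracy: %.2f' % (accuracy * 100))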
THANKS
