
Chapter 27

Multilayer Perceptron (MLP)

H. Taud and J.F. Mas

Abstract Artificial neural networks have been found to be outstanding tools able to generate generalizable models in many disciplines. In this technical note, we present the multi-layer perceptron (MLP), which is the most commonly used type of neural network.

Keywords Calibration · Neural networks · Non-linear relationships · Backpropagation

1 Short Description of Interest

Artificial Neural Networks (ANNs) are structures inspired by the function of the brain. These networks can estimate model functions and handle linear and nonlinear relationships by learning from data and generalizing to unseen situations. One of the most popular ANNs is the multi-layer perceptron (MLP). This is a powerful modeling tool, which applies a supervised training procedure using examples of data with known outputs (Bishop 1995). This procedure generates a nonlinear function model that enables the prediction of output data from given input data.

See Chap. 2 about calibration.

H. Taud (✉)
Centro de Innovación y Desarrollo Tecnológico en Cómputo,
Instituto Politécnico Nacional, Mexico City, Mexico
e-mail: [email protected]
J.F. Mas
Centro de Investigaciones en Geografía Ambiental, Universidad Nacional
Autónoma de México (UNAM), Morelia, Michoacán, Mexico
e-mail: [email protected]

© Springer International Publishing AG 2018


M.T. Camacho Olmedo et al. (eds.), Geomatic Approaches for Modeling
Land Change Scenarios, Lecture Notes in Geoinformation and Cartography,
https://doi.org/10.1007/978-3-319-60801-3_27

2 Technical Details

In order to understand the MLP, a brief introduction to the one neuron perceptron
and single layer perceptron is provided. The former represents the simplest neural
network and has only one output to which all inputs are connected. Given $i = 0, 1, \ldots, n$, where $n$ is the number of inputs, the quantities $\{w_i\}$ are the weights of the neuron. The inputs $\{x_i\}$ correspond to features or variables, and the output $y$ to their predicted binary class. Figure 1 describes the three steps forming the perceptron model, and Fig. 2 shows its simplified representation. The weighting step multiplies each input feature value by its weight $\{x_i w_i\}$; in the second step the products are added together ($x_0 w_0 + x_1 w_1 + \cdots + x_n w_n$). The third is the transfer step, where an activation function $f$ (also called a transfer function) is applied to the sum, producing an output $y$ given by:

$$y = f(z) \quad \text{and} \quad z = \sum_{i=0}^{n} w_i x_i \qquad (1)$$

where $x_0 = 1$, $w_0$ is the threshold or bias, and $y$ is the output.
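To make the three steps concrete, the following minimal Python sketch implements Eq. (1) for a single neuron with a unit-step activation; the input and weight values are hypothetical.

```python
# Minimal single-neuron perceptron forward pass (Eq. 1); illustrative values only.

def heaviside(z):
    """Unit step activation: 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0

def perceptron_output(x, w):
    """x and w include the bias term: x[0] = 1, w[0] = bias weight w0."""
    z = sum(wi * xi for wi, xi in zip(w, x))  # weighting + sum steps
    return heaviside(z)                       # transfer step

# Hypothetical example: two features plus the bias input x0 = 1.
x = [1.0, 0.5, -1.2]    # [x0, x1, x2]
w = [-0.3, 0.8, 0.4]    # [w0 (bias), w1, w2]
print(perceptron_output(x, w))  # predicted binary class, 0 or 1
```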


The activation function can take various forms; the most common ones are listed in Table 1.
From Eq. (1), a perceptron can only learn linearly separable functions. Figure 3a shows an example of a linear function $w_1 x_1 + w_2 x_2 + w_0 = 0$ that separates the data into two classes. In two dimensions, with two features, the function is a line; in three dimensions, with three features, it is a plane. In $n$ dimensions, it is a hyperplane with equation:

Fig. 1 Perceptron steps, from left to right: weighting, sum, and transfer

Fig. 2 Perceptron model, from left to right: (a) steps model, (b) simplified model

Table 1 Some activation functions (2D graphs not reproduced)

Unit step (Heaviside):  $f(z) = \begin{cases} 1 & z \ge 0 \\ 0 & z < 0 \end{cases}$
Linear:                 $f(z) = z$
Logistic (sigmoid):     $f(z) = \dfrac{1}{1 + e^{-z}}$
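For reference, the three activation functions of Table 1 can be written directly in Python; this is a minimal sketch with no dependencies beyond the standard library.

```python
import math

def unit_step(z):
    """Heaviside: 1 for z >= 0, 0 otherwise."""
    return 1.0 if z >= 0 else 0.0

def linear(z):
    """Identity activation."""
    return z

def logistic(z):
    """Sigmoid: squashes z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))
```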

Fig. 3 Input patterns, from left to right: (a) linearly separable, (b) nonlinearly separable

$$\sum_{i=0}^{n} w_i x_i = 0 \qquad (2)$$

Equation (2) can be expressed as the dot product between the weight vector $W$ and the input vector $X$:

$$W \cdot X = 0 \qquad (3)$$
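As a quick numerical check of Eqs. (2) and (3), the sign of the dot product $W \cdot X$ indicates on which side of the separating hyperplane an input lies; the vectors below are hypothetical.

```python
import numpy as np

W = np.array([-0.3, 0.8, 0.4])   # weight vector; W[0] is the bias w0
X = np.array([1.0, 0.5, -1.2])   # input vector with x0 = 1

z = np.dot(W, X)                 # Eq. (3): W . X
predicted_class = 1 if z >= 0 else 0
print(z, predicted_class)
```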

In the learning step (also known as the training step), the known responses of the input training data are used. The purpose of learning is to optimize the weights by minimizing a cost function, which is usually the squared error between the known response and the estimated one. Iterative techniques such as gradient descent determine the optimum weight vector, and the algorithm converges to a solution that yields an operational network configuration. The model is then validated with new data in order to show how well the configuration generalizes to new situations.
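A minimal sketch of this learning step, assuming a logistic activation and plain gradient descent on the squared-error cost; the training data and settings below are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical training set: rows are [x0=1, x1, x2]; targets are known 0/1 responses.
X = np.array([[1, 0.2, 0.7], [1, 0.9, 0.1], [1, 0.4, 0.5], [1, 0.8, 0.8]])
t = np.array([0, 1, 0, 1])

w = np.zeros(3)   # weight vector, w[0] is the bias
lr = 0.5          # learning rate

for epoch in range(1000):
    y = sigmoid(X @ w)                     # estimated responses
    error = y - t                          # difference from known responses
    grad = X.T @ (error * y * (1 - y))     # gradient of the squared-error cost
    w -= lr * grad                         # gradient-descent weight update
```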
The parallel connection of many perceptrons generates a single layer perceptron (SLP) architecture, which is used when there are several outputs. Figure 4a shows an example with an input and an output layer serving a linearly separable multiclass case. The perceptron and the single layer perceptron cannot solve nonlinearly separable problems (Fig. 3b). In this case, a solution can be found by adding one or more layers in succession, creating an MLP architecture (Fig. 4b). The output of one layer becomes the input of the next, and so on.

Fig. 4 Layer structure: (a) SLP with three inputs and four outputs, (b) MLP with three inputs, two hidden layers, and two outputs

The first and last layers are called the input and output layers respectively, while the others are the hidden layers of the neural network.
The MLP is a layered feedforward neural network in which the information
flows unidirectionally from the input layer to the output layer, passing through the
hidden layers (Bishop 1995). Each connection between neurons has its own weight.
Perceptrons within the same layer share the same activation function, generally a sigmoid for the hidden layers. Depending on the application, the output layer can use either a sigmoid or a linear activation function.
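To illustrate this unidirectional flow, the following sketch propagates an input through sigmoid hidden layers and a linear output layer; the layer sizes and randomly initialized weights are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, weights, biases):
    """Propagate x through sigmoid hidden layers and a linear output layer."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = sigmoid(W @ a + b)               # hidden layers
    return weights[-1] @ a + biases[-1]      # linear output layer

# Hypothetical architecture: 3 inputs -> 4 hidden -> 4 hidden -> 2 outputs.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), rng.normal(size=(2, 4))]
biases = [np.zeros(4), np.zeros(4), np.zeros(2)]

y = mlp_forward(np.array([0.1, 0.5, -0.2]), weights, biases)
```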
Among many other algorithms, the most widely known MLP learning algorithm is backpropagation, a generalization of the least mean squares (LMS) rule (Du and Swamy 2014). Weights are corrected by propagating the errors from layer to layer, starting with the output layer and working backwards, hence the name backpropagation.
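A minimal sketch of one backpropagation step for a network with a single hidden layer, sigmoid units, and a squared-error cost; biases are omitted for brevity, and all names and shapes are assumptions rather than the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, t, W1, W2, lr=0.1):
    """One forward/backward pass; returns updated weight matrices."""
    # Forward pass
    h = sigmoid(W1 @ x)    # hidden layer activations
    y = sigmoid(W2 @ h)    # output layer activations

    # Backward pass: propagate the error from the output layer backwards
    delta_out = (y - t) * y * (1 - y)              # output layer error term
    delta_hid = (W2.T @ delta_out) * h * (1 - h)   # hidden layer error term

    # Gradient-descent weight corrections
    W2 = W2 - lr * np.outer(delta_out, h)
    W1 = W1 - lr * np.outer(delta_hid, x)
    return W1, W2
```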
The MLP model performance depends not only on the choice of variables, the number of hidden layers and nodes, and the training data, but also on training parameters such as the learning rate, the momentum controlling the weight change, and the number of iterations. An MLP with a single hidden layer may identify the nonlinear function with lower accuracy, while networks with more hidden layers are more likely to overfit the training data. The learning rate and the momentum control the speed and effectiveness of the learning process.
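As an illustration of these two parameters, the classic momentum update blends the current gradient step with the previous weight change; the variable names below are assumptions for the sketch.

```python
def momentum_update(w, grad, prev_delta, learning_rate=0.1, momentum=0.9):
    """Weight change combining the current gradient step with the previous change."""
    delta = -learning_rate * grad + momentum * prev_delta
    return w + delta, delta
```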
In land change modeling, the analysis of the complex relationships between land transitions and the large number of variables acting as drivers requires advanced empirical techniques to find a nonlinear function that describes such a complex relationship (Mas et al. 2014). Variables such as distance, slope, soil type, land tenure, etc. are presented at the input nodes of the network, and each output node represents a different land transition (e.g. forest to pasture, forest to cropland, forest to urban). The network is trained with examples for which the explanatory variable values are known, as well as the land transitions observed in the past. After the training step, the MLP is able to predict the change potential of each transition when new input data are presented to the network (Pijanowski et al. 2002; Mas et al. 2004).
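As a hedged sketch of this kind of setup, the snippet below uses scikit-learn's MLPClassifier; the driver values, class codes, and network settings are hypothetical and only illustrate the input/output arrangement described above.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Hypothetical training data: one row per location, columns are driver variables
# (e.g. distance to roads, slope, soil type code, land tenure code).
X_train = np.array([[120.0, 5.2, 1, 0],
                    [430.0, 12.8, 2, 1],
                    [60.0, 3.1, 1, 1],
                    [300.0, 9.4, 3, 0]])

# Observed past transitions: 0 = forest->pasture, 1 = forest->cropland, 2 = forest->urban.
y_train = np.array([0, 1, 0, 2])

model = MLPClassifier(hidden_layer_sizes=(10,), activation='logistic',
                      solver='sgd', learning_rate_init=0.1, momentum=0.9,
                      max_iter=2000, random_state=0)
model.fit(X_train, y_train)

# Change potentials for a new location (one probability per transition class).
X_new = np.array([[200.0, 7.5, 2, 1]])
potentials = model.predict_proba(X_new)
```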

References

Bishop CM (1995) Neural networks for pattern recognition. Oxford University Press, Oxford
Du K-L, Swamy MNS (2014) Neural networks and statistical learning. Springer, Berlin
Mas JF, Puig H, Palacio JL, Sosa AA (2004) Modelling deforestation using GIS and artificial
neural networks. Environ Model Softw 19(5):461–471
Mas JF, Kolb M, Paegelow M, Camacho Olmedo MT, Houet T (2014) Inductive pattern-based
land use/cover change models: a comparison of four software packages. Environ Model Softw
51:94–111
Pijanowski BC, Brown DG, Shellito BA, Manik GA (2002) Using neural nets and GIS to forecast land use changes: a land transformation model. Comput Environ Urban Syst 26(6):553–575
