Backpropagation
1 Introduction
Technological advancement has evolved and become incorporated into modern devices and into people's daily lives, as in the case of capturing better photographs or the increasing use of digital assistants on a day-to-day basis. Consequently, people do not skimp on expenses when it comes to using all the technology embedded in these devices.
Artificial Intelligence (AI) has become a trend due to its applicability in a large number of devices for daily use.
Artificial Neural Networks emulate the functioning of the human brain: they are formed by a group of nodes, also known as artificial neurons, that connect and transmit signals to each other. Signals are transmitted from the input until the output is produced, and the primary objective of such a model is to learn through continuous and automatic modification of itself, so that it can perform complex tasks that cannot be achieved with classical rule-based programming. This makes it possible to automate functions that previously could only be performed by human beings [1].
Artificial Vision, or Computer-Aided Vision, has been developing alongside AI. It consists of the automatic deduction of the structure and properties of a three-dimensional, possibly dynamic, world from one or several two-dimensional images of that world. This area of knowledge combines concepts from color physics, optics, electronics, geometry, algorithms, and computer systems [2]. Computer Vision (CV) has advanced through the analysis of images by computers, and it has been implemented in industry on two fundamental fronts: obtaining the greatest interaction between machines and their environment, and achieving total quality control in the products that are manufactured [2].
The development of new algorithms and the invention of new digital cameras have driven the expansion of the spectrum of applications in recent years, as in the case of new digital cameras and photographic applications that can detect, by means of facial recognition, the face of the person in the image.
This type of technological advance has opened up new fields of application in industry, such as security, automatic and autonomous driving of vehicles, new ways of interacting with controllers through hand gestures and eye movement, and the growing boom of the Internet of Things (IoT) [3].
2 Images
A digital image can be represented as a matrix I of N rows by M columns, where each element I(x, y) stores the intensity of the pixel at position (x, y):

I = [I(x, y)],  0 ≤ x ≤ N − 1,  0 ≤ y ≤ M − 1   (1)

where N and M can be any value within the natural numbers, and the number of intensity levels p of each element is a power of 2:

0 ≤ I(x, y) ≤ p − 1,  with p = 2^q   (2)

The values contained in each element of the image represent luminosity, from dark levels to the lightest values. The darkest level corresponds to the lowest value of the interval and is represented by the color black; the lightest level is represented by the color white and corresponds to the highest value.
The image is a two-dimensional function that provides certain electromagnetic information for each of its values. Each of these discrete elements is a point, or pixel, and generally contains the lighting level or the color of a point in the scene. The set of points or pixels forms the image, or photograph, of the scene. The image comes from the spectral representation received by a sensor, with color generated by the superposition of three spectral components [5].
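As a minimal sketch of this representation (Python and NumPy are used here for brevity; the matrix values and dimensions are hypothetical):

```python
import numpy as np

# A grayscale image stored as an N x M matrix of discrete intensity values.
# With q = 8 bits per pixel there are p = 2**8 = 256 levels:
# 0 is black, p - 1 = 255 is white. This 4x4 matrix is hypothetical.
q = 8
p = 2 ** q

image = np.array([
    [0,   32,  64,  96],
    [128, 160, 192, 224],
    [255, 224, 192, 160],
    [128, 96,  64,  32],
], dtype=np.uint8)

N, M = image.shape
print(N, M)                                 # matrix dimensions
print(int(image.min()), int(image.max()))   # darkest and lightest values present
```

Every pixel satisfies the range 0 ≤ I(x, y) ≤ p − 1 described above.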
2.1 Lighting
During the design of artificial vision systems, uncontrolled lighting of the environment is usually not acceptable, since it yields images with low contrast, specular reflections, shadows, and flares. A well-designed lighting system provides light to the scene in such a way that the image obtained favors its subsequent processing, maintaining and even improving the information necessary for the detection and extraction of objects and characteristics of interest [6].
Two types of lighting can be considered: those that can affect image processing (natural light such as the sun, moon, and stars) and artificial lights. Artificial lights are the most commonly used in artificial vision, since they can be adjusted depending on the physical environment that surrounds the system; they transform electrical energy into photons and produce the different types of lighting, each with its own characteristics. A type of lighting is defined by the range of wavelengths emitted by the light, the stability of the wavelength over time, the variations produced over time, and the temperature that can be reached at that instant [6].
Technique      How the light is directed         Advantage                      Disadvantage
Backlighting   Directed at the shadow of         Image edges are sharp          Image details are not
               the image                                                        visible to the naked eye
Oblique        Directed only at the object,      The shadow of the image        It only focuses on a
               obliquely                         will be skewed by the light    single object
Dark-Ground    Directed with transparent         Useful for highlighting        The contrast at the
(Fig. 1)       materials                         cracks and bubbles             edges is decreased
                                                 within an object
Coaxial        Directed with illumination        The edges are well             The surface is not well
               through a mirror                  defined                        defined
Figure 1 shows an example of images to which the different lighting techniques are applied, in this case direct lighting and dark-ground lighting.
Fig. 1. Image capture applying Direct lighting (Left) and Dark-Ground lighting (Right). Source:
[2]
3 Image processing
Image processing is a set of techniques that facilitates the extraction of information from an image or object, through optical methods or digital means such as the computer, applying different mathematical models and algorithms, among others.
This type of processing works on the pixel, the physical unit of color that is part of the digital image; images are formed by a succession of pixels, which presents coherence in the information displayed and constitutes an information matrix.
Digital image processing is performed by dividing the image into a matrix of values, or pixels; a numerical value is assigned to the average brightness of each pixel, together with the coordinates of its position, resulting in a complete definition of the image.
There are several mathematical models that allow colors to be represented in numerical form using chromatic values: RGB (Red, Green, Blue) and CMYK (Cyan, Magenta, Yellow, Key), each of which associates a numeric vector in a color space.
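As a minimal illustration of the RGB model (Python here for brevity; the average-brightness function below uses the plain mean of the three channels, which is one of several possible conventions):

```python
# A color in the RGB model is a numeric vector of three chromatic values,
# here with 8 bits per channel (range 0..255).
red   = (255, 0, 0)
white = (255, 255, 255)
black = (0, 0, 0)

def average_brightness(r, g, b):
    """Average brightness of a pixel: the plain mean of the three channels
    (perceptually weighted formulas also exist)."""
    return (r + g + b) // 3

print(average_brightness(*white))  # 255
print(average_brightness(*black))  # 0
print(average_brightness(*red))    # 85
```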
3.1 Binarization
Binarization is the process of reducing an image to simpler colors, in this case black and white. This technique makes it possible to distinguish different regions of the image that have a similar intensity distribution, with the grayscale histogram showing which areas belong to which regions. Threshold binarization is a technique that allows as much information as possible to be obtained: it identifies a threshold as an intensity value, so that pixels on one side of the threshold form one subgroup and are classified as white, while the other subgroup is classified as black.
For the detection of the threshold, the histogram of the image is essential, since it defines the intensity levels by means of their relative frequency.
Assuming that the objects have intensity values greater than those of the environment (Fig. 2), binarization is performed by checking whether the current value of a pixel (x, y) in the image f is greater than the threshold: if f(x, y) > T, the pixel is considered part of the object; otherwise, it is considered part of the environment [8].
Explicitly, the threshold value T is obtained by an operation of the form:

T = T(x, y, p(x, y), f(x, y))   (3)

where x and y are the coordinates of the pixel in the image f, f(x, y) represents the intensity of the pixel at these coordinates, and p(x, y) is a local property of the point used to discriminate. With the threshold obtained from this equation, a binary image g(x, y) can be defined:
g(x, y) = { 0   if f(x, y) > T
            1   if f(x, y) ≤ T }   (4)
Examining g(x, y), it can be observed that the pixels assigned a value of 0 belong to objects, while those with a value of 1 belong to the environment. Fig. 2 shows an example with a grayscale image and its respective segmentations with different thresholds.
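The thresholding rule of eq. (4) can be sketched as follows (Python and NumPy here for brevity, although the system described later was written in C#; the sample image is hypothetical, with the thresholds chosen to mirror those in Fig. 2):

```python
import numpy as np

def binarize(f, T):
    """Apply eq. (4): g(x, y) = 0 where f(x, y) > T (object),
    1 otherwise (environment)."""
    return np.where(f > T, 0, 1)

# Hypothetical grayscale image with intensities normalized to [0, 1].
f = np.array([[0.1, 0.5],
              [0.7, 0.9]])

print(binarize(f, 0.3).tolist())  # [[1, 0], [0, 0]]
print(binarize(f, 0.6).tolist())  # [[1, 1], [0, 0]]
```

Raising the threshold moves more pixels into the environment class, which is why the two segmentations in Fig. 2 differ.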
Fig. 2. Original image in grayscale, binarized image with threshold 𝑇 = 0.3 and with threshold
𝑇 = 0.6. Source: [5]
Fig. 3. Laplacian edge detector applying the second derivative (◼ Laplacian: measures changes in the gradient; ◼ gradient difference). Source: [6]
Artificial Neural Networks (ANN) are computer systems that try to emulate the functioning of the human brain from the point of view of learning and adaptation. ANNs, also called Distributed Processing Systems, are massively distributed parallel processes that store experimental knowledge [12].
The main characteristics of ANNs are learning through examples or samples, and the interconnection weights ("synapses") that are adjusted during learning (Fig. 4).
E(w⃗) ≡ (1/2) Σ_{d ∈ D} Σ_{k ∈ outputs} (t_kd − o_kd)²   (5)

where each parameter represents: w⃗ the weights vector; D the set of training examples; d a concrete training example; outputs the set of output neurons; k an output neuron; t_kd the correct output that the output neuron k should give when the training example d is applied to the network; and o_kd the output calculated by the output neuron k when the training example d is applied to the network [15].
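Equation (5) can be computed directly (Python here for brevity; the target and output values below are hypothetical):

```python
import numpy as np

def sum_squared_error(targets, outputs):
    """E(w) = 1/2 * sum over examples d and output neurons k of
    (t_kd - o_kd)**2, as in eq. (5). Rows are training examples,
    columns are output neurons."""
    t = np.asarray(targets, dtype=float)
    o = np.asarray(outputs, dtype=float)
    return 0.5 * np.sum((t - o) ** 2)

# Hypothetical targets t_kd and network outputs o_kd:
# two training examples, two output neurons each.
t = [[1.0, 0.0], [0.0, 1.0]]
o = [[0.8, 0.2], [0.4, 0.6]]
print(sum_squared_error(t, o))  # 0.5 * (0.04 + 0.04 + 0.16 + 0.16) ~= 0.2
```

Backpropagation adjusts the weights in the direction that reduces this error.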
The Object Recognition System (ORS) was developed in the C# programming language with several open-source libraries, which allow the image to be obtained through a camera and processed, making it easier to implement edge detection and image segmentation algorithms as well as neuron training, so that, according to the captured image, the system has the ability to identify objects through its own database.
For the processing of the images, the AForge.dll, AForge.Imaging.dll, and AForge.Math.dll libraries were used; they provide source code specialized in Artificial Intelligence and Computer Vision, developed by Andrew Kirillov for the .NET framework.
The NeuronDotNet.Core.dll library takes care of neuron training, using a multi-layer input and a 2-layer output. Vijeth Dinesha developed this library as an Artificial Intelligence project.
When the project is executed, a screen appears (Fig. 6) showing the different options with which the program starts to carry out the processing of the image using binarization and edge detection.
Once the image treatment is done, the network process is performed, in which the neuron is trained using the Hough moments and the affine matrix, whose values are obtained to build the RGB histogram (Fig. 7).
Fig. 7. ORS neuron training screen, showing the weights, related values, and Hough moments obtained.
To perform the training, the Error Correction Learning algorithm is applied, with which the weight adjustments are made, differentiating the types of leaves and the Hough moments; the number of iterations and the percentage (%) of error tolerance are given as input parameters.
Once the neuron is trained, the last step (Processing) is carried out, and the values of the weights, the Hough moments, and the affine matrix are observed, with which the histogram of the image can be produced (Fig. 8).
Fig. 8. ORS Processing Screen, where the Weights, Hough values, and others are identified, in
addition to the Histogram.
Once the entire neuron training process is performed, these data are saved in a file with the values obtained. This file helps so that, when a new process is executed and the values are similar, the system automatically identifies the same image and shows it on the screen.
The result of the implementation of the Backpropagation algorithm is the integration of the different processes of image processing and learning by error correction, which allows obtaining small output values according to its structure and training weights.
Backpropagation is a learning algorithm used in multilayer perceptrons; it solves complex problems efficiently, turning out to be more efficient than traditional programs. It can be used in industry as part of process control, where it makes it possible to identify whether a manufactured product has a defect in its structure, improving product quality and, in turn, allowing the improvement of the raw material.
Conclusions
There are several models and algorithms of Artificial Neural Networks, some even more efficient than the one implemented in this work, as in the case of the Kohonen algorithm, which is used in many applications of transit, global positioning, among others.
Currently there are tools that perform facial recognition using their own knowledge base, which makes them more efficient; they also use more advanced algorithms such as Hopfield and Kohonen networks.
Amazon has its own recognition tool, called Amazon Rekognition, which is used for facial recognition through mobile devices.
This makes artificial intelligence a tool for improving times and processes in industry, since production can be monitored in real time and the necessary corrections applied so that it is not affected by errors. The aim is to motivate the research and implementation of this type of technology, not only in industry but also in everyday life, such as in intelligent buildings, thus giving a differential to this type of device.
The purpose of this work is to study the Backpropagation algorithm as a starting point towards neural networks and their possible uses, knowing that there are many learning algorithms being used in other programming languages such as Python. The intention is that it can be applied in other programming languages and for other uses.
We recommend that this type of research work be used to motivate the development and improvement of new projects involving new methodologies of Artificial Intelligence and Image Processing, since nowadays they are on the rise thanks to their low cost and ease of learning. Thus, innovative projects with social benefit can be carried out, such as facial recognition, which would help public security to recognize people who have committed illicit acts, improving the quality of our society.
References
1. W. Rivas Asanza and B. Mazón Olivo, Redes neuronales artificiales aplicadas al
reconocimiento de patrones, 2016th ed. Machala: Editorial UTMACH.
2. G. Pajares Martinsanz, Conceptos y métodos en Visión por Computador. España: Grupo
de Visión del Comité de Automática (CEA), 2016.
3. F. PÉREZ, A. Félix, and J. L. GUERRA, “Internet de las Cosas,” Perspectiv@s, vol. 10,
pp. 45–46, 2017.
4. J. L. Gonzalez Galvis and J. A. Parra Abril, “Diseño e implementación de un sistema de reconocimiento de naranjas para el robot GIO 1 usando visión asistida por computador,” 2015.
5. R. C. Gonzalez and R. E. Woods, Digital Image Processing. Chicago: Pearson, 2008.
6. E. R. Davies, Machine Vision: Theory, Algorithms, Practicalities. Oxford: Academic Press, 2015.
7. S. van der Walt, J. Schönberger, J. Nunez-Iglesias, and F. Boulogne, “scikit-image:
image processing in Python,” Peerj, 2014.
8. A. Ruiz C and S. Basualdo M, Redes Neuronales: Conceptos Básicos y Aplicaciones.
Rosario: PubliEditorial, 2001.