
Implementation of an Object Recognizer Through Image

Processing and Backpropagation Learning Algorithm

Abstract. Artificial neural networks, in an effort to emulate the operation of the human brain from the point of view of learning and adaptation, have evolved in such a way that different statistical and mathematical models have been inspired by biological models; for example, nerve cells, better known as neurons, which are composed of dendrites responsible for capturing the nerve impulses emitted by other neurons. The present study analyzes the Backpropagation model and the multilayer topology by means of an object recognizer based on digital image processing techniques such as object segmentation, edge detection, and edge determination.

Keywords: Backpropagation, Neural Network, Synapse, ANN, Digital Imaging, Pixel, Machine Vision, Perceptron, Optics.

1 Introduction

Technological advancement has evolved and become incorporated into different modern devices and into people's daily lives, as in the case of capturing better photographs or the increasing use of digital assistants on a day-to-day basis. Consequently, people do not skimp on expenses when it comes to using all the technology embedded in these devices.
Artificial Intelligence (AI) has become a trend due to its applicability in a large number of devices for daily use.
Artificial Neural Networks emulate the functioning of the human brain. They are formed by a group of nodes, also known as artificial neurons, that connect and transmit signals to each other, from the input through to the output. The primary objective of such a model is to learn by continuously and automatically modifying itself, in such a way that it can perform complex tasks that cannot be achieved by means of classical rule-based programming. With this, it becomes possible to automate functions that previously could only be performed by human beings [1].
Artificial Vision, or Computer-Aided Vision, has also been developing alongside AI. It consists of the automatic deduction of the structure and properties of a three-dimensional, possibly dynamic, world from one or several two-dimensional images of that world. This area of knowledge combines concepts from color physics, optics, electronics, geometry, algorithms, and computer systems [2]. Computer Vision (CV), based on the analysis of images by computers, has been implemented in industry on two fundamental fronts: obtaining the greatest interaction between machines and their environment, and achieving total quality control in the products that are manufactured [2].

The development of new algorithms and the invention of new digital cameras have expanded the spectrum of applications in recent years, as in the case of new digital cameras and photographic applications that can detect the face of a person in an image by means of facial recognition.
This type of technological advance has opened up new fields of application in industry, such as security, automatic and autonomous driving of vehicles, new ways of interacting with controllers through hand gestures and eye movements, and the growing boom of the Internet of Things (IoT) [3][4].

2 Images

An image is a two-dimensional representation of a three-dimensional world scene. The image is the result of the acquisition of a signal provided by a sensor, which converts the information from the electromagnetic spectrum into numerical encodings [5]. The chosen image representation format is therefore discrete, not only in the values used but also in the parameters that define it.
In a generic way, a digital image can be defined as a matrix of N×M dimensions, where each element of the matrix contains a discrete value that quantifies the information level of the corresponding element, represented with a finite number of bits (q). Thus, an image can be expressed as a discrete two-dimensional function in the following way:

I(x, y), 0 ≤ x ≤ N − 1, 0 ≤ y ≤ M − 1 (1)

where N and M can be any natural numbers, and the number of intensity levels p is a power of 2 determined by the number of bits q:

0 ≤ I(x, y) ≤ p − 1, with p = 2^q (2)

The values contained in each element of the image represent luminosity, from dark levels to the lightest values. The darkest level corresponds to the lowest value of the interval and is represented by the color black; the lightest level is represented by the color white and corresponds to the highest value.
The image is thus a two-dimensional function that provides certain electromagnetic information for each of its values. Each of these discrete elements is a point, or pixel, and generally contains the lighting level or the color of a point in the scene. The set of points or pixels forms the image or photograph of the scene. The image comes from the spectral representation received by a sensor, with color generated by the superposition of three spectral components [5].
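Eqs. (1) and (2) can be illustrated with a short NumPy sketch (an illustration only; the paper's own system is written in C#, and the dimensions and bit depth below are arbitrary example values):

```python
import numpy as np

# A grayscale image as an N x M matrix with q = 8 bits per pixel,
# so each element lies in [0, 2**q - 1] = [0, 255], as in Eq. (2).
N, M, q = 4, 5, 8
rng = np.random.default_rng(0)
img = rng.integers(0, 2**q, size=(N, M), dtype=np.uint8)

# I(x, y) is defined for 0 <= x <= N-1, 0 <= y <= M-1, as in Eq. (1).
assert img.shape == (N, M)
assert int(img.min()) >= 0 and int(img.max()) <= 2**q - 1
print(img[0, 0])  # intensity of the top-left pixel
```

Each matrix element is one pixel; the dtype bounds the discrete values exactly as the finite number of bits q does in the text.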

2.1 Lighting
During the design of artificial vision systems, uncontrolled lighting of the environment is usually not acceptable, since it results in images with low contrast, specular reflections, shadows, and flares. A well-designed lighting system provides light to the scene in such a way that the image obtained favors its subsequent processing, maintaining and even improving the information necessary for the detection and extraction of objects and characteristics of interest [6].
Two types of lighting can be considered: natural light (such as that of the sun, moon, and stars), which can interfere with image processing, and artificial light. Artificial lights are the most commonly used in artificial vision, since they can be adjusted to the physical environment that surrounds them; they transform electrical energy into photons and produce the different common types of lighting, each with its own characteristics. A type of lighting is defined by the range of wavelengths emitted, the stability of the wavelength and its variations over time, and the temperature that can be reached at a given instant [6].

2.2 Lighting Techniques

The proper functioning of a vision application depends fundamentally on lighting. If adequate lighting is not used, problems with contrast, brightness, and shadows can occur, which complicate the inspection algorithm's task; in some cases the algorithm may not find a solution at all.
To improve a vision system, it is necessary to use an adequate lighting technique, which is a determining factor in obtaining a correct image to be processed. An example is an image in which the pixels that represent the objects of interest share similar luminosity characteristics and are very different from the pixels that do not represent objects of interest. Appropriate lighting is critical for a correct image, in which no saturated areas or shadows appear that hide information within the image. Shadows cause false edge detections, resulting in incorrect measurements [7].
In this context, weak lighting can result in a low signal-to-noise ratio, which leads to a low-quality image with noisy pixels.
To choose good lighting, it is necessary to determine the behavior of each component of the vision system at the time of image capture. Each component influences the amount of light that reaches the sensor, and the quality of the captured image depends on this. The aperture of the optics diaphragm directly affects the amount of light reaching the sensor: if the diaphragm is closed, the amount of light coming from the scene, or the exposure time, must be increased to achieve an image with the same brightness values. A small area reflects less light than a large area, so it is necessary to take the lighting's point of view into account. In this sense, the advantages and disadvantages of the different lighting techniques should be considered (Table 1) [8].

Table 1. Advantages and Disadvantages of the different Lighting Techniques.

Technique | Description | Advantages | Disadvantages
Directional (Fig. 1) | Directed at the object | Low price | Produces brightness (glare)
Diffuse | Directed at the entire work area | Low brightness | Not useful in tight places
Backlighting | Directed at the silhouette of the object | Image edges are sharp | Image details are not visible to the naked eye
Oblique | Directed only at the object, obliquely | Focuses on a single object | The shadow of the image will be skewed by the light
Dark-Ground (Fig. 1) | Directed through transparent materials | Useful for highlighting cracks and bubbles within an object | The contrast at the edges is decreased
Coaxial | Illumination directed through a mirror | The edges are well defined | The surface is not well defined
Structured | Directed in both visible and non-visible spectra | Visibility of the entire object | Loss of color distinction

Figure 1 shows an example of images to which different lighting techniques are applied, in this case direct lighting and dark-ground lighting.

Fig. 1. Image capture applying Direct lighting (left) and Dark-Ground lighting (right). Source: [2]

3 Image processing

Image processing is a set of techniques that facilitates the search for information in an image or object, through information extraction techniques such as optical methods or digital means such as the computer, applying different mathematical models and algorithms, among others.
This type of processing uses the pixel, the physical unit of color that makes up the digital image; images are formed by a succession of pixels, which present coherence in the information displayed, constituting an information matrix.
Digital image processing is performed by dividing the image into a matrix of values, or pixels; a numerical value is assigned to the average brightness of each pixel along with the coordinates of its position, resulting in a complete definition of the image. There are several mathematical models that allow the representation of colors in numerical form using chromatic values, such as RGB (Red, Green, Blue) and CMYK (Cyan, Magenta, Yellow, Key), which associate a numeric vector with each point in a color space.

3.1 Binarization
Binarization is the process of reducing an image to simpler colors, in this case black and white. This technique makes it possible to distinguish different regions of the image that have a similar intensity distribution, with the grayscale histogram showing which areas belong to which regions. Threshold binarization is a technique that allows obtaining as much information as possible: it identifies a threshold as an intensity value such that the pixels of one group are classified as white, while the other subgroup is classified as black.
For the detection of the threshold, the histogram of the image is essential, since it defines the intensity levels by means of their relative frequencies.
Assuming that the objects have intensity values greater than those of the environment (Fig. 2), binarization is performed by checking whether the value of a pixel (x, y) in the image f is greater than the threshold: if f(x, y) > T, the pixel is considered part of the object; otherwise, it is considered part of the environment [8].
Explicitly, the threshold value T is obtained by an operation of the form:

T = T[x, y, p(x, y), f(x, y)] (3)

where x and y are the coordinates of the pixel in the image f, f(x, y) represents the intensity of the pixel at those coordinates, and p(x, y) is a local property of the point used to discriminate. With the threshold obtained from this equation, a binary image g(x, y) can be defined:

g(x, y) = 0 if f(x, y) > T; 1 if f(x, y) ≤ T (4)

Examining g(x, y), the pixels assigned a value of 0 are objects, while those with a value of 1 are the environment. Fig. 2 shows an example with a grayscale image and its respective segmentations

Fig. 2. Original image in grayscale, binarized image with threshold 𝑇 = 0.3 and with threshold
𝑇 = 0.6. Source: [5]
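Eq. (4) can be written directly in NumPy. The following is a sketch (not the paper's C# implementation), using intensities normalized to [0, 1] as in the thresholds of Fig. 2:

```python
import numpy as np

def binarize(f, T):
    """Apply Eq. (4): pixels brighter than threshold T become 0 (object),
    all other pixels become 1 (environment)."""
    return np.where(f > T, 0, 1).astype(np.uint8)

# A small example image with intensities in [0, 1].
f = np.array([[0.1, 0.5],
              [0.7, 0.2]])
print(binarize(f, 0.3))
# object pixels (value 0) are exactly those with f(x, y) > 0.3
```

Changing T, as in Fig. 2 (T = 0.3 vs. T = 0.6), changes which pixels end up in the object group.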

3.2 Edge Extraction

The edges of a digital image are transitions between the gray levels of two different regions. They provide information on the borders of objects and are used for image segmentation and object recognition, among other tasks.
The edges of each region differ from the background, allowing their detection based on sudden changes in the intensity level.
Edge detection techniques usually employ operators based on discrete approximations of the first and second derivatives of the gray levels of the image. The second-derivative gradient method, also known as the Laplacian mask, is the strongest with regard to the detection of lines, and is therefore the most used [9].
The Laplacian filter allows the enhancement of linear features; to highlight the elements of greatest variability, the image obtained by a Laplacian filter is subtracted from the original image (Fig. 3) [6].

Fig. 3. Laplacian Edge Detector applying the second derivative (◼ Laplacian: measures changes in the gradient; ◼ Gradient difference). Source: [6]
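A second-derivative edge detector can be sketched as a convolution with a Laplacian mask. The 4-neighbour kernel below is a common choice and an assumption on our part, not necessarily the exact mask used in the paper:

```python
import numpy as np

# Discrete Laplacian (second-derivative) kernel, 4-neighbour form.
LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]])

def convolve3x3(img, kernel):
    """Naive valid-mode 2-D convolution with a symmetric 3x3 kernel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i+3, j:j+3] * kernel)
    return out

# A flat region with one bright pixel: the Laplacian responds only
# where the intensity changes abruptly, i.e. at the "edge".
img = np.array([[10, 10, 10, 10],
                [10, 10, 10, 10],
                [10, 10, 50, 10],
                [10, 10, 10, 10]], dtype=float)

edges = convolve3x3(img, LAPLACIAN)
print(edges)  # large magnitudes where the gray level changes sharply
```

In flat areas the kernel's weights cancel and the response is zero; around the bright pixel the response is large, which is the behavior the section describes.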

3.3 Object Recognition Using Hough Transforms

The Hough transform makes it possible to detect curves in an image. This technique is robust against noise and gaps at the edge of the object. For its application, the image is first binarized to obtain the edge of the object.
The Hough transform aims to find the aligned points in the image that satisfy the equation of a line for given values of ρ and θ. The equation of the line transformed to polar coordinates is ρ = x·cos θ + y·sin θ. It is then necessary to discretize the parameter space into accumulation cells.
Here ρ is the perpendicular distance from the origin to the line, and θ is the angle between the perpendicular to the line and the horizontal axis, measured counterclockwise. Therefore, each straight line can be associated with the parameters (ρ, θ), and this parameter plane (ρ, θ) is the Hough space [10].
The equation of the line is then evaluated for each point (x_k, y_k) in the image; each point increases by one the value of every cell whose (ρ, θ) it satisfies. Cells with high values indicate sets of points that belong to the same line [11].
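The voting scheme above can be sketched as follows (the accumulator resolution and ρ range are illustrative assumptions, not parameters from the paper):

```python
import numpy as np

def hough_lines(points, n_theta=180, rho_max=100):
    """Vote in the (rho, theta) accumulator for each edge point,
    following rho = x*cos(theta) + y*sin(theta)."""
    thetas = np.deg2rad(np.arange(n_theta))          # theta = 0..179 degrees
    acc = np.zeros((2 * rho_max, n_theta), dtype=int)
    for x, y in points:
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + rho_max, np.arange(n_theta)] += 1  # offset: rho can be negative
    return acc, thetas

# Ten collinear points on the vertical line x = 5.
pts = [(5, y) for y in range(10)]
acc, thetas = hough_lines(pts)

# All ten points lie on x = 5, so the cell (rho=5, theta=0) collects 10 votes.
print(acc[5 + 100, 0])  # → 10
```

The cell with the highest count identifies the dominant line's (ρ, θ) parameters, exactly as the accumulation-cell description states.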

4 Artificial Neural Networks

Artificial Neural Networks (ANN) are computer systems that try to emulate the functioning of the human brain from the point of view of learning and adaptation. ANNs, also called Distributed Processing Systems, are massively distributed parallel processors that store experimental knowledge [12].
The main characteristics of ANNs are learning through examples or samples, and the interconnection weights ("synapses") that are adjusted during learning (Fig. 4).

Fig. 4. Connected Artificial Neural Networks. Source: [12]



4.1 Artificial Neuron

The neuron is the fundamental information-processing unit in an ANN. It is a device that computes a response, or output, from an input vector with values coming from outside or from other neurons. Fig. 5 shows a biological neuron and an artificial neuron [13].

Fig. 5. Biological Neuron and an Artificial Neuron. Source: [13]

In the artificial neuron model we have:

− Link Connections (synaptic weight parameters W_jn):
  if W_jn > 0 the connection is excitatory;
  if W_jn < 0 the connection is inhibitory;
− Sum Point (the weighted sum of the inputs and their synaptic weights);
− Activation Function (a non-linear transformation function, where the summed value is transformed into the output signal);
− Polarization or Network Function (also known as the threshold, which shifts the value of the inputs).
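The elements above can be combined into a single artificial neuron. In this sketch the sigmoid activation and the example weights are illustrative assumptions, not values from the paper:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: the sum point computes the weighted sum of
    the inputs plus the bias (threshold), and the activation function
    (here a sigmoid) transforms that sum into the output signal."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-s))  # sigmoid activation

# One excitatory (w > 0) and one inhibitory (w < 0) connection.
out = neuron(inputs=[1.0, 0.5], weights=[0.8, -0.4], bias=0.0)
print(round(out, 3))  # → 0.646
```

The positive weight excites the neuron and the negative weight inhibits it, matching the excitatory/inhibitory distinction in the list above.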

4.2 Backpropagation Algorithm

Depending on the ANN, there are several connection topologies and learning algorithms, chosen according to their use. In this case, the model to be reviewed is the Backpropagation model [14].
The Backpropagation model combines several perceptrons in a multilayer network and carries out the learning using a backward propagation of the error, after the error has been computed in a forward pass through the network.
Backpropagation is a supervised learning algorithm: the expected output associated with each input is known, and the weights and gains are updated through the rule of gradient descent. To supervise and control the error made, the error function is redefined as follows:

E(W) ≡ ½ Σ_{d∈D} Σ_{k∈outputs} (t_kd − o_kd)² (5)

where each parameter represents: W, the weights vector; D, the set of training examples; d, a concrete training example; outputs, the set of output neurons; k, an output neuron; t_kd, the correct output that the output neuron k should give when applying training example d to the network; and o_kd, the output computed by the output neuron k when applying training example d to the network [15].
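Eq. (5) translates directly into code. The target and output values below are made-up example numbers:

```python
def network_error(targets, outputs):
    """Eq. (5): half the sum of squared differences between the correct
    outputs t_kd and the computed outputs o_kd, over all training
    examples d in D and all output neurons k."""
    return 0.5 * sum(
        (t - o) ** 2
        for t_row, o_row in zip(targets, outputs)
        for t, o in zip(t_row, o_row)
    )

# Two training examples, two output neurons each.
t = [[1.0, 0.0], [0.0, 1.0]]
o = [[0.8, 0.2], [0.1, 0.9]]
print(round(network_error(t, o), 3))  # → 0.05
```

Driving this quantity toward zero is exactly what the gradient-descent weight updates of the next subsection do.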

4.3 Learning Process

During the learning process, the weights are adjusted through an interaction between the neurons and the environment: a neural network modifies its weights in response to input information. The changes that occur during learning reduce to the destruction, modification, and creation of connections between neurons [16]. In biological systems, there is a continuous destruction and creation of connections between neurons. In ANN models, a new connection is created when its weight takes a value other than zero, and a connection is destroyed when its weight becomes zero. The Backpropagation algorithm propagates the error backwards, providing an efficient method to calculate the error derivatives ∂E/∂y, which convert the discrepancies between the desired output and the network output into weight corrections.
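A minimal single-neuron sketch of this weight-update process is shown below; the learning rate, sigmoid activation, and training pair are illustrative assumptions, not the ORS training code:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

# One sigmoid neuron trained by gradient descent: the weight moves in the
# direction that reduces the squared error, using dE/dw via the chain rule.
w, bias, lr = 0.5, 0.0, 1.0
x, t = 1.0, 1.0  # single training input and its target output

for _ in range(100):
    o = sigmoid(w * x + bias)
    delta = (o - t) * o * (1 - o)  # dE/ds for E = 0.5 * (t - o)^2
    w -= lr * delta * x            # dE/dw = dE/ds * x
    bias -= lr * delta             # dE/db = dE/ds

print(sigmoid(w * x + bias) > 0.9)  # → True
```

After repeated updates the neuron's output approaches the desired output, which is the "discrepancy converted into weight corrections" described above; the full Backpropagation algorithm applies the same chain rule layer by layer.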

5 Implementation of the Object Recognition System

The Object Recognition System (ORS) was developed in the C# programming language with several open-source libraries, which allow obtaining the image through a camera and processing it. This makes it easier to implement edge detection and image segmentation algorithms, as well as neuron training, so that the system can identify captured images through its own database.
For the processing of the images, the AForge.dll, AForge.Imaging.dll, and AForge.Math.dll libraries were used; these are specialized in Artificial Intelligence and Computer Vision, and were developed by Andrew Kirillov for the .NET framework.
The NeuronDotNet.Core.dll library takes care of neuron training; it uses a multilayer input and a 2-layer output. Vijeth Dinesha developed this library as an Artificial Intelligence project.
When the project is executed, a screen appears (Fig. 6) showing the different options with which the program starts processing the image using binarization and edge detection.

Fig. 6. ORS Image Acquisition and Processing Screen.

Once the image treatment is done, the network process is performed, in which the neuron is trained using the Hough moments and the affine matrix, whose values are obtained to build the RGB histogram (Fig. 7).

Fig. 7. ORS Neuron Training and obtaining the Weights and Related Values and Hough Moments
Screen.

To perform the training, the Error Correction Learning algorithm is applied, with which the weight adjustments are made, differentiating the types of leaves and the Hough moments; the number of iterations and the error tolerance percentage (%) are given as input parameters.
Once the neuron is trained, the last step (Processing) is carried out, and the values of the weights, the Hough moments, and the affine matrix are observed, from which the histogram of the image can be produced (Fig. 8).

Fig. 8. ORS Processing Screen, where the Weights, Hough values, and others are identified, in
addition to the Histogram.

Once the entire neuron training process has been performed, the data are saved to a file with the values obtained. This file allows that, when a new process is executed and the values are similar, the system automatically identifies the same image and shows it on the screen.
The result of the implementation of the Backpropagation algorithm is the integration of the different processes of image processing and learning by error correction, which allows obtaining small output error values, according to the network's structure and training weights.
Backpropagation is a learning algorithm used in multilayer perceptrons, applied to solve complex problems in an efficient way, turning out to be more efficient than traditional programs. It can be used in industry as part of process control, making it possible to identify whether a manufactured product has a defect in its structure, improving product quality and, in turn, the use of raw material.

Conclusions
There are several models and algorithms of Artificial Neural Networks, some even more efficient than the one implemented in this work, such as the Kohonen algorithm, which is used in many applications of transit, global positioning, and others.
Currently there are tools that perform facial recognition using their own knowledge bases, making them more efficient; they also use more advanced algorithms such as Hopfield and Kohonen networks. Amazon, for example, has a recognition tool called Amazon Rekognition that is used for facial recognition through mobile devices.
This makes artificial intelligence a tool used to improve times and processes in industry, since production can be monitored in real time and the necessary corrections applied, so that it is not affected by errors in production. The aim is to motivate the research and implementation of this type of technology, not only in industry but also in everyday life, such as in intelligent buildings, thus giving a differential advantage to this type of device.
The purpose of this work is to study the Backpropagation algorithm as a starting point towards neural networks and their possible uses, knowing that there are many learning algorithms being used in other programming languages such as Python. The intention is that it can be applied in other programming languages and for other uses.
We recommend that this type of research work be used to motivate the development and improvement of new projects involving new methodologies of Artificial Intelligence and Image Processing, since nowadays they are on the rise thanks to their low cost and easy learning. Thus, innovative projects with social benefit can be performed, such as facial recognition, which would help public security recognize people who have committed illicit acts, improving the quality of our society.

References
1. W. Rivas Asanza and B. Mazón Olivo, Redes neuronales artificiales aplicadas al reconocimiento de patrones. Machala: Editorial UTMACH, 2016.
2. G. Pajares Martinsanz, Conceptos y métodos en Visión por Computador. España: Grupo de Visión del Comité de Automática (CEA), 2016.
3. F. Pérez, A. Félix, and J. L. Guerra, "Internet de las Cosas," Perspectiv@s, vol. 10, pp. 45–46, 2017.
4. J. L. González Galvis and J. A. Parra Abril, "Diseño e implementación de un sistema de reconocimiento de naranjas para el robot GIO 1 usando visión asistida por computador," 2015.
5. R. C. Gonzalez and R. E. Woods, Digital Image Processing. Pearson, 2008.
6. E. R. Davies, Machine Vision: Theory, Algorithms, Practicalities. Oxford: Academic Press, 2015.
7. S. van der Walt, J. Schönberger, J. Nunez-Iglesias, and F. Boulogne, "scikit-image: image processing in Python," PeerJ, 2014.
8. A. Ruiz C and S. Basualdo M, Redes Neuronales: Conceptos Básicos y Aplicaciones. Rosario: PubliEditorial, 2001.
9. P. D. Wasserman, Neural Computing: Theory and Practice. Van Nostrand Reinhold, 1989.
10. G. Montavon, G. B. Orr, and K.-R. Müller (Eds.), Neural Networks: Tricks of the Trade. Springer, 2012.
11. S. Haykin, Neural Networks and Learning Machines. Prentice Hall, 2018.
12. L. P. Rouhiainen, Inteligencia Artificial. Barcelona: Editorial Planeta S.A., 2018.
13. R. Flores L and J. M. Fernández F, Las Redes Neuronales Artificiales: Fundamentos Teóricos y Aplicaciones Prácticas. La Coruña: NetBiblo, S.L., 2015.
14. C. M. Bishop, Neural Networks for Pattern Recognition. Oxford University Press, 2006.
15. P. Guinot M and M. Ortí, Introducción a las Redes Neuronales Aplicadas al Control Industrial. Valencia: Pearson, 2013.
16. B. González, F. Valdez, P. Melin, and G. Prado-Arechiga, "Fuzzy logic in the gravitational search algorithm for the optimization of modular neural networks in pattern recognition," pp. 5839–5847, 2015.
