
Implementation of an Object Recognizer Through Image Processing and the Backpropagation Learning Algorithm

Abstract. Artificial neural networks, in an effort to emulate the operation of the human brain from the point of view of learning and adaptation, have evolved in such a way that different statistical and mathematical models have been inspired by biological models; an example is the nerve cell, better known as the neuron, which is composed of dendrites responsible for capturing the nerve impulses emitted by other neurons. The present study analyzes the Backpropagation model and the multilayer topology by means of an object recognizer based on digital image processing techniques such as object segmentation and edge detection.

Keywords: Backpropagation, Neural Network, Synapse, ANN, Digital Imaging, Pixel, Machine Vision, Perceptron, Optics.

1 Introduction

Technological advancement has evolved and been incorporated into different modern devices and into people's daily lives, as in the case of capturing better photographs or the increasing use of digital assistants on a day-to-day basis. Consequently, people spare no expense when it comes to using all the technology these devices offer.
Artificial Intelligence (AI) has become a trend due to its applicability in a large number of devices for daily use.
Artificial Neural Networks emulate the functioning of the human brain. They are formed by a group of nodes, also known as artificial neurons, that connect and transmit signals to each other, from the input through to the production of the output. The primary objective of such a model is to learn by continuously and automatically modifying itself, in such a way that it can perform complex tasks that cannot be achieved with classical rule-based programming. This makes it possible to automate functions that previously could only be performed by human beings [1].
Artificial Vision, or Computer Aided Vision, has been developing alongside AI. It consists of the automatic deduction of the structure and properties of a three-dimensional, possibly dynamic world from one or several two-dimensional images of that world. This area of knowledge combines concepts from color physics, optics, electronics, geometry, algorithmics, and computer systems [2]. Computer Vision (CV) has been developing alongside AI and Neural Networks through the analysis of images by computers, and has been implemented in industry on two fundamental fronts: obtaining the greatest interaction between machines and their environment, and achieving total quality control in the products that are manufactured [2].

The development of new algorithms and the invention of new digital cameras have driven the expansion of the spectrum of applications in recent years, as in the case of new digital cameras and photographic applications that can detect a person's face in an image by means of facial recognition.
This type of technological advance has opened up new fields of application in industry, such as security, automatic and autonomous driving of vehicles, new ways of interacting with controllers through hand gestures and eye movement, and the growing boom of the Internet of Things (IoT) [3].

2 Images

An image is a two-dimensional representation of a three-dimensional world scene. The image is the result of the acquisition of a signal provided by a sensor, which converts the information from the electromagnetic spectrum into numerical encodings [5]. The chosen image representation format is therefore discrete, not only in the values used but also in the parameters that define it.
In a generic way, a digital image can be defined as a matrix of N×M dimensions, where each element of the matrix contains a discrete value that quantifies the information level of the corresponding element, represented with a finite number of bits (q). Thus, an image can be expressed as a discrete two-dimensional function in the following way:

I(x, y), 0 ≤ x ≤ N − 1, 0 ≤ y ≤ M − 1 (1)

Where N and M can be any natural numbers, and the value of each element is bounded by a power of 2:

0 ≤ I(x, y) ≤ p − 1, with p = 2^q (2)

The values contained in each of the elements of the image represent levels of luminosity from dark to light. The darkest level corresponds to the lowest value of the interval and is represented by the color black; the lightest level corresponds to the highest value and is represented by the color white.
The image is thus a two-dimensional function that provides certain electromagnetic information for each of its values. Each of these discrete elements is a point, or pixel, and generally contains the lighting level or the color of a point in the scene. The set of points or pixels forms the image or photograph of the scene. The image comes from the spectral representation received by a sensor, with color generated by the superposition of three spectral components [5].
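This definition can be sketched in a few lines of code (a minimal illustration, with toy values that are not from the paper): a grayscale image stored as an N×M matrix with q = 8 bits per pixel, so every element lies between 0 (black) and 2^q − 1 (white).

```python
# A digital image as an N x M matrix of discrete intensity values.
# With q = 8 bits per pixel, p = 2^q levels are available, from 0 to p - 1.
q = 8
levels = 2 ** q  # p = 256

# A toy 2 x 3 grayscale "image" (N = 2 rows, M = 3 columns).
image = [
    [0, 128, 255],  # black, mid gray, white
    [64, 200, 32],
]

# Every pixel must satisfy 0 <= I(x, y) <= p - 1, as in eq. (2).
assert all(0 <= pixel <= levels - 1 for row in image for pixel in row)

darkest = min(min(row) for row in image)   # lowest value -> black
lightest = max(max(row) for row in image)  # highest value -> white
print(darkest, lightest)
```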

2.1 Lighting

During the design of artificial vision systems, uncontrolled lighting of the environment is usually not acceptable, since it produces images with low contrast, specular reflections, shadows, and flares. A well-designed lighting system provides light to the scene in such a way that the image obtained favors its subsequent processing, maintaining and even improving the information necessary for the detection and extraction of objects and characteristics of interest [6].
Two types of lighting can be considered: natural light (such as the sun, moon, and stars), which can affect image processing, and artificial light. Artificial lights are the most commonly used in artificial vision, since they can be adjusted to the surrounding physical environment. An artificial source transforms electrical energy into photons and produces the different types of lighting, each with its own characteristics: the range of wavelengths emitted, the stability of the wavelength and its variations over time, and the temperature that can be reached at a given instant [6].

2.2 Lighting Techniques

The proper functioning of a vision application is fundamentally dependent on lighting. If adequate lighting is not used, problems with contrast, brightness, and shadows can occur, which make the inspection algorithm difficult; in some cases the algorithm may not find a solution at all.
To improve the vision system, it is necessary to use an adequate lighting technique, which is a determining factor in obtaining a correct image to be processed. An example is an image in which the pixels that represent the objects of interest have similar luminosity characteristics and are very different from the pixels that do not represent objects of interest. Appropriate lighting is critical for a correct image, in which no saturated areas or shadows hide information. Shadows cause false edge detections, resulting in incorrect measurements [7].
In this context, weak lighting can result in a low signal-to-noise ratio, which leads to a low-quality image with noisy pixels.
To choose good lighting, it is necessary to determine the role of each component of the vision system at the time of image capture. Each of the components influences the amount of light that reaches the sensor, and the quality of the captured image therefore depends on it. The aperture of the optics diaphragm directly affects the amount of light reaching the sensor: if the diaphragm is closed, the amount of light coming from the scene, or the exposure time, must be increased to achieve an image with the same brightness values. A small area reflects less light than a large one, so the point of view of the lighting must also be taken into account. In this sense, the advantages and disadvantages of the lighting techniques should be considered (Table 1) [8].

Table 1. Advantages and Disadvantages of the different Lighting Techniques.

Technique | Description | Advantages | Disadvantages
Directional (Fig. 1) | Directed at the object | Low price | Glare (brightness)
Diffuse | Directed at the entire work area | Low glare | Not useful in tight places
Backlighting | Directed at the silhouette of the object | Image edges are sharp | Image details are not visible to the naked eye
Oblique | Directed only at the object, obliquely | It focuses on a single object | The shadow of the image will be skewed by the light
Dark-Ground (Fig. 1) | Directed through transparent materials | Useful for highlighting cracks and bubbles within an object | The contrast at the edges is decreased
Coaxial | Illumination directed through a mirror | The edges are well defined | The surface is not well defined
Structured | Directed in both visible and non-visible light | Visibility of the entire object | Loss of color distinction

Figure 1 shows an example of images to which different lighting techniques are applied, in this case direct lighting and dark-ground lighting.

Fig. 1. Image capture applying Direct lighting (Left) and Dark-Ground lighting (Right). Source:
[2]

3 Image Processing

Image processing is a set of techniques that facilitates the extraction of information from an image or object, through optical methods or digital means such as the computer, applying different mathematical models and algorithms, among others.
This type of processing works on the pixel, the physical unit of color that forms part of the digital image. Images are formed by a succession of pixels, which presents coherence in the information displayed, constituting an information matrix.
Digital image processing is performed by dividing the image into a matrix of values or pixels; a numerical value is assigned to the average brightness of each pixel together with the coordinates of its position, resulting in a complete definition of the image.
There are several mathematical models that allow the representation of colors in numerical form using chromatic values, such as RGB (Red, Green, Blue) and CMYK (Cyan, Magenta, Yellow, Key), each of which associates a numeric vector in a color space.
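As a minimal sketch of the RGB model just described (illustrative values, not from the paper): each pixel is a numeric vector of three components, and one common way to reduce it to a single luminosity value is a weighted average. The 0.299/0.587/0.114 weights below are the classic ITU-R BT.601 luma weights, a conventional choice rather than one prescribed by this work.

```python
# An RGB pixel as a numeric vector (red, green, blue), 8 bits per channel.
pixel = (255, 128, 0)  # an orange tone

def to_gray(rgb):
    """Weighted average commonly used for RGB -> grayscale conversion."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)

gray = to_gray(pixel)
print(gray)
```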

3.1 Binarization

Binarization is the process of reducing an image to simpler colors, in this case black and white. This technique makes it possible to distinguish different regions of the image that have a similar intensity distribution, with the gray-scale histogram showing which areas belong to which regions. Threshold binarization identifies a threshold as an intensity value: pixels above it form one subgroup and are classified as white, while the other subgroup is classified as black.
For the detection of the threshold, the histogram of the image is essential, since it defines the intensity levels by means of their relative frequency.
Assuming that the objects have intensity values greater than those of the environment (Fig. 2), binarization is performed by checking whether the current value of a pixel (x, y) in the image f is greater than the threshold: if f(x, y) > T, the pixel is considered part of the object; otherwise, it is considered part of the environment [8].
Explicitly, the threshold value T is obtained by an operation of the form:

T = T[x, y, p(x, y), f(x, y)] (3)

Where x and y are the coordinates of the pixel in the image f, f(x, y) represents the intensity of the pixel at these coordinates, and p(x, y) is a property of the point used to discriminate. With the threshold obtained from this equation, a binary image g(x, y) can be defined:

g(x, y) = { 0 if f(x, y) > T
            1 if f(x, y) ≤ T }   (4)

Examining g(x, y), the pixels assigned a value of 0 are objects, while those with a value of 1 belong to the environment. Fig. 2 shows an example with a grayscale image and the corresponding images segmented with different thresholds.

Fig. 2. Original image in grayscale, binarized image with threshold T = 0.3 and with threshold T = 0.6. Source: [5]
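Equation (4) can be sketched directly in code (pure Python, with toy intensity values; the paper's convention of 0 for object pixels and 1 for the environment is kept):

```python
def binarize(image, threshold):
    """Apply eq. (4): g(x, y) = 0 if f(x, y) > threshold (object), else 1."""
    return [[0 if pixel > threshold else 1 for pixel in row] for row in image]

# Toy grayscale image, intensities in [0, 255]; T is chosen by hand here,
# although in practice it would come from the histogram, as the text explains.
f = [
    [10, 200, 30],
    [220, 40, 250],
]
g = binarize(f, threshold=128)
print(g)  # object pixels (intensity > 128) become 0, the rest become 1
```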

3.2 Edge Extraction

The edges of a digital image are transitions between gray levels in two different regions. They provide information on the borders of objects and are used for image segmentation and object recognition, among other tasks.
The edges of each region are different from the background, allowing their detection based on sudden changes in the intensity level.
Edge detection techniques usually employ operators based on discrete approximations of the first and second derivatives of the gray levels of the image. The second-derivative gradient method, also known as the Laplacian mask, is the strongest for the detection of lines and is therefore the most used [9].
The Laplacian filter allows the enhancement of linear features; to highlight the elements of greatest variability, the image obtained by a Laplacian filter is subtracted from the original image (Fig. 3) [6].

Fig. 3. Laplacian edge detector applying the second derivative (Laplacian: measures changes in the gradient; gradient difference). Source: [6]
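A minimal sketch of a Laplacian filter follows (pure Python; the common 4-neighbor 3×3 kernel and the toy step-edge image are illustrative choices, not taken from the paper). Flat regions give a response of zero, while the intensity transition gives a strong positive/negative pair, which is what makes the operator useful for edge detection.

```python
# 4-neighbour Laplacian kernel: a discrete approximation of the second derivative.
KERNEL = [
    [0,  1, 0],
    [1, -4, 1],
    [0,  1, 0],
]

def laplacian(image):
    """Convolve the interior pixels of a grayscale image with the kernel."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            acc = 0
            for ky in (-1, 0, 1):
                for kx in (-1, 0, 1):
                    acc += KERNEL[ky + 1][kx + 1] * image[y + ky][x + kx]
            out[y][x] = acc
    return out

# A vertical step edge between intensity 10 and intensity 90.
img = [
    [10, 10, 90, 90],
    [10, 10, 90, 90],
    [10, 10, 90, 90],
]
print(laplacian(img))
```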

3.3 Object Recognition Using Hough Transforms

The Hough transform makes it possible to detect curves in an image. This technique is robust against noise and gaps at the edge of the object. For its application, the image is first binarized to obtain the edge of the object.
The Hough transform aims to find the aligned points in the image that satisfy the equation of a line for given values of ρ and θ. The equation of the line transformed to polar coordinates is ρ = x·cos θ + y·sin θ, where ρ is the perpendicular distance from the origin to the line, and θ is the angle formed between the perpendicular to the line and the horizontal axis, measured counterclockwise. It is then necessary to discretize the parameter space into accumulation cells.
Each straight line can therefore be associated with the parameters (ρ, θ), and this parameter plane (ρ, θ) is the Hough space [10].
The equation of the line is then evaluated for each point (x_k, y_k) in the image; if the equation is satisfied, the value of the corresponding cell is increased by one. Cells with high values indicate points that belong to a line [11].
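The voting scheme above can be sketched as follows (pure Python; the discretization steps and the toy set of collinear points are illustrative choices). Each edge point votes for every (ρ, θ) cell consistent with it, and cells shared by many points accumulate many votes.

```python
import math

def hough_lines(points, rho_step=1.0, theta_steps=180):
    """Accumulate votes in the discretized (rho, theta) parameter plane."""
    acc = {}
    for x, y in points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps             # theta in [0, pi)
            rho = x * math.cos(theta) + y * math.sin(theta)
            cell = (round(rho / rho_step), t)             # accumulation cell
            acc[cell] = acc.get(cell, 0) + 1
    return acc

# Four collinear points on the horizontal line y = 5.
points = [(0, 5), (3, 5), (7, 5), (10, 5)]
acc = hough_lines(points)
(best_rho, best_t), votes = max(acc.items(), key=lambda kv: kv[1])
print(best_rho, votes)  # the winning cell collects one vote per point
```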

4 Artificial Neural Networks

Artificial Neural Networks (ANN) are computer systems that try to emulate the functioning of the human brain from the point of view of learning and adaptation. ANNs, also called Distributed Processing Systems, are massively distributed parallel processes that store experimental knowledge [12].
The main characteristics of ANNs are learning through examples or samples, and the interconnection weights ("synapses") that are adjusted during learning (Fig. 4).

Fig. 4. Connected Artificial Neural Networks.



4.1 Artificial Neuron

The neuron is the fundamental information-processing unit in an ANN. It is a device that computes a response or output from an input vector whose values come from outside or from other neurons. Fig. 5 shows a biological neuron and an artificial neuron [13].

Fig. 5. Biological Neuron and an Artificial Neuron.

In the artificial neuron model we have:

 Link connections (synaptic weight parameters W_jn):
   if W_jn > 0 the connection is excitatory;
   if W_jn < 0 the connection is inhibitory;
 Sum point (the weighted sum of the inputs and their synaptic weights);
 Activation function (a non-linear transformation function, where the summed value is transformed into the output signal);
 Polarization or bias (also known as the threshold, which allows the value of the inputs to be shifted).
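The four elements listed above can be sketched as a single artificial neuron (pure Python; the weights, bias, and the sigmoid activation are illustrative choices, not values from the paper):

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus bias (sum point), passed through a sigmoid."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias  # sum point + bias
    return 1.0 / (1.0 + math.exp(-s))                        # activation function

# One excitatory (w > 0) and one inhibitory (w < 0) connection, plus a bias.
out = neuron(inputs=[1.0, 0.5], weights=[2.0, -1.0], bias=-0.5)
print(round(out, 3))
```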

4.2 Backpropagation algorithm


Depending on the ANN, there are several connection topologies and learning algo-
rithms, depending on their use. In this case, the model to be review is the Backpropa-
gation Model, [14]
Backpropagation model tries to combine several perceptron’s in a type of multi-
layer network, and carry out the learning using a backward propagation algorithm,
backward from the error. The term refers to the calculation method on an error envi-
ronment of a network, forward.
Backpropagation is a supervised learning algorithm, which expects to know the
expected output, and is associated with each input, updating the weights and gains,
through the rule of descending steps. To supervise the control of the error made, the
error function is redefine, leaving a new error function and is the following:

E(w) ≡ (1/2) Σ_{d ∈ D} Σ_{k ∈ outputs} (t_kd − o_kd)²   (5)

Where each parameter represents: w, the weights vector; D, the set of training examples; d, a concrete training example; outputs, the set of output neurons; k, an output neuron; t_kd, the correct output that the output neuron k should give when the training example d is applied to the network; o_kd, the output calculated by the output neuron k when the training example d is applied to the network [15].
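Equation (5) can be computed directly (the targets and outputs below are toy values chosen for illustration, not results from the paper):

```python
def error(targets, outputs):
    """E(w) = 1/2 * sum over examples d and output neurons k of (t_kd - o_kd)^2."""
    total = 0.0
    for t_d, o_d in zip(targets, outputs):   # over training examples d in D
        for t_kd, o_kd in zip(t_d, o_d):     # over output neurons k
            total += (t_kd - o_kd) ** 2
    return 0.5 * total

# Two training examples, two output neurons each (illustrative values).
targets = [[1.0, 0.0], [0.0, 1.0]]
outputs = [[0.8, 0.1], [0.3, 0.9]]
print(error(targets, outputs))
```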

4.3 Learning Process

For the learning process, the weights are adjusted through an interaction between the neurons and the environment: a neural network modifies its weights in response to input information. The changes that occur during learning reduce to the destruction, modification, and creation of connections between neurons [16]. Analogously, in biological systems there is a continuous destruction and creation of connections between neurons.
In ANN models, the creation of a new connection means that its weight takes a value other than zero, and a connection is destroyed when its weight becomes zero. The Backpropagation algorithm propagates the error backwards, providing an efficient method to calculate the error derivatives ∂E/∂y, which convert the discrepancies between the desired output and the network output into weight corrections.
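As a minimal sketch of this weight-adjustment process, the following trains a single sigmoid neuron for one gradient-descent step on the error of eq. (5); all numeric values are illustrative. A multilayer network would additionally propagate the delta term backwards through the hidden layers.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

# One training example: input vector x, target output t.
x, t = [1.0, 0.5], 1.0
w, bias, rate = [0.1, -0.2], 0.0, 0.5  # initial weights, bias, learning rate

def forward(w, bias):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + bias)

def loss(o):
    return 0.5 * (t - o) ** 2  # eq. (5) for a single example and output

before = loss(forward(w, bias))

# One gradient-descent step; for a sigmoid unit the error term is
# delta = (t - o) * o * (1 - o), and each weight moves by rate * delta * x_i.
o = forward(w, bias)
delta = (t - o) * o * (1 - o)
w = [wi + rate * delta * xi for wi, xi in zip(w, x)]
bias += rate * delta

after = loss(forward(w, bias))
print(before, after)  # the error decreases after the update
```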

5 Implementation of the Object Recognition System

The Object Recognition System (ORS) was developed in the C# programming language with several open-source libraries, which allow obtaining the image through a camera and processing it, making it easier to implement edge detection and image segmentation algorithms as well as neuron training, so that, according to the captured image, the system can identify objects through its own database.
For the processing of the images, the AForge.dll, AForge.Imaging.dll, and AForge.Math.dll libraries were used. AForge is a library specialized in Artificial Intelligence and Computer Vision, developed by Andrew Kirillov for the .NET framework.
The NeuronDotNet.Core.dll library takes care of neuron training, using a multilayer input and a 2-layer output. Vijeth Dinesha developed this library as an Artificial Intelligence project.
When the project is executed, a screen appears (Fig. 6) showing the different options with which the program starts to process the image using binarization and edge detection.

Fig. 6. ORS Image Acquisition and Processing Screen.

Once the image treatment is done, the network process is performed, with which the neuron is trained using the Hough moments and the affine matrix, whose values are obtained to compute the RGB histogram (Fig. 7).

Fig. 7. ORS Neuron Training Screen, showing the weights, related values, and Hough moments.

To perform the training, the Error Correction Learning algorithm is applied, with which the weight adjustments are made, differentiating the types of leaves and the Hough moments; the number of iterations and the error tolerance percentage (%) are given as input parameters.
Once the neuron is trained, the last step (Processing) is carried out, and the values of the weights, the Hough moments, and the affine matrix are observed, with which the histogram of the image can be computed (Fig. 8).

Fig. 8. ORS Processing Screen, where the weights, Hough values, and other data are identified, in addition to the histogram.

Once the entire neuron training process has been performed, these data are saved in a file with the values obtained. This file allows the system, when a new process is executed and the values are similar, to automatically identify the same image and show it on the screen.
The result of the implementation of the Backpropagation algorithm is the integration of the different processes of image processing and learning by error correction, which yields small output values according to the network's structure and training weights.
Backpropagation is a learning algorithm used in multilayer perceptrons, applied to solve complex problems efficiently, turning out to be more efficient than traditional programs. It can be used in industry as part of process control, to identify whether a manufactured product has a defect in its structure, improving product quality and, in turn, the use of raw material.

Conclusions
There are several models and algorithms of Artificial Neural Networks, some even more efficient than the one implemented in this work, such as the Kohonen algorithm, which is used in many applications of transit and global positioning, among others.
There are several tools that perform facial recognition using their own knowledge base, making them more efficient; they also use more advanced algorithms such as Hopfield and Kohonen. Amazon, for instance, has its own recognition tool, Amazon Rekognition, which is used for facial recognition through mobile devices.
This makes artificial intelligence a tool for improving times and processes in industry, since production can be monitored in real time and the necessary corrections applied so that it is not affected by errors. The aim is to motivate the research and implementation of this type of technology, not only in industry but also in everyday life, such as in intelligent buildings, thus giving this type of device a differentiator.
The purpose of this work is to study the Backpropagation algorithm as a starting point towards neural networks and their possible uses, knowing that there are many learning algorithms being used in other programming languages such as Python. The intention is that it can be applied in other programming languages and for other uses.
We recommend that this type of research work be used to motivate the development and improvement of new projects involving new methodologies of Artificial Intelligence and Image Processing, since nowadays they are on the rise thanks to their low cost and easy learning. In this way, innovative projects with social benefit can be carried out, such as facial recognition, which would help public security to recognize people who have committed illicit acts, improving the quality of our society.

References
1. W. Rivas Asanza and B. Mazón Olivo, Redes neuronales artificiales aplicadas al reconocimiento de patrones. Machala: Editorial UTMACH, 2016.
2. G. Pajares Martinsanz, Conceptos y métodos en Visión por Computador. España: Grupo de Visión del Comité Español de Automática (CEA), 2016.
3. F. Pérez, A. Félix, and J. L. Guerra, "Internet de las Cosas," Perspectiv@s, vol. 10, pp. 45–46, 2017.
4. J. L. González Galvis and J. A. Parra Abril, "Diseño e implementación de un sistema de reconocimiento de naranjas para el robot GIO 1 usando visión asistida por computador," 2015.
5. R. C. Gonzalez and R. E. Woods, Digital Image Processing. Chicago: Pearson, 2008.
6. E. R. Davies, Machine Vision: Theory, Algorithms, Practicalities. Oxford: Academic Press, 2015.
7. S. van der Walt, J. Schönberger, J. Nunez-Iglesias, and F. Boulogne, "scikit-image: image processing in Python," PeerJ, 2014.
8. A. Ruiz C. and S. Basualdo M., Redes Neuronales: Conceptos Básicos y Aplicaciones. Rosario: PubliEditorial, 2001.
9. P. D. Wasserman, Neural Computing: Theory and Practice. Netherlands: Van Nostrand Reinhold, 1989.
10. G. Montavon, G. B. Orr, and K.-R. Müller, Neural Networks: Tricks of the Trade. Springer, 2012.
11. S. Haykin, Neural Networks and Learning Machines. Prentice Hall, 2018.
12. L. P. Rouhiainen, Inteligencia Artificial. Barcelona: Editorial Planeta S.A., 2018.
13. R. Flores L. and J. M. Fernández F., Las Redes Neuronales Artificiales: Fundamentos Teóricos y Aplicaciones Prácticas. La Coruña: NetBiblo, S.L., 2015.
14. C. M. Bishop, Neural Networks for Pattern Recognition. Oxford University Press, 2006.
15. P. Guinot M. and M. Ortí, Introducción a las Redes Neuronales Aplicadas al Control Industrial. Valencia: Pearson, 2013.
16. B. González, F. Valdez, P. Melin, and G. Prado-Arechiga, "Fuzzy logic in the gravitational search algorithm for the optimization of modular neural networks in pattern recognition," Expert Systems with Applications, pp. 5839–5847, 2015.
