Artificial Neural Network and Image Processing
Presented by
Ashwini G S
Jyothi K N
Madhushree M
Artificial Neural Network:
A neural network is a group of connected input/output units in which each connection
has an associated weight; it is implemented as a computer program.
It helps you to build predictive models from large databases.
This model builds upon the human nervous system.
It helps you to conduct image understanding, human learning, computer
speech, etc.
The neural network itself may be used as a piece in many machine learning
algorithms to process complex data inputs into a space that computers can
understand.
Neural networks are being applied to many real-life problems today,
including speech and image recognition, spam email filtering, finance, and
medical diagnosis etc.
Architecture of ANN with single neuron
The McCulloch-Pitts Neuron - Mankind's First Mathematical Model of a
Biological Neuron
It is the first mathematical model of a neuron [Warren McCulloch and Walter Pitts, 1943]
The McCulloch-Pitts neural model is also known as linear threshold gate.
It is a neuron with a set of inputs I1, I2, I3, ..., In and one output.
The linear threshold gate simply classifies the set of inputs into two different classes. Thus the
output is binary. Such a function can be described mathematically using these equations.
Sum = ∑(i=1 to N) Ii · Wi
y = f(Sum)
W1, W2, W3, ..., Wn are weight values normalised in the range of either (0, 1) or (-1, 1) and
associated with each input line, Sum is the weighted sum, and T is a threshold constant. The
function f is a linear step function at threshold T, as shown in the figure.
The McCulloch-Pitts model of a neuron is simple yet has substantial computing potential. It has
a precise mathematical definition. However, this model is so simplistic that it only generates a
binary output and also weight and threshold values are fixed. The neural computing algorithm
has diverse features for various applications. Thus, we need to obtain the neural model with
more flexible computational features.
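The behaviour described above can be sketched in a few lines of Python (the function name and the AND-gate example are ours, chosen for illustration):

```python
# Minimal sketch of a McCulloch-Pitts neuron: fixed weights, fixed threshold,
# binary output (1 if the weighted sum reaches the threshold, else 0).
def mcp_neuron(inputs, weights, threshold):
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# A 2-input AND gate: the neuron fires only when both inputs are 1.
print(mcp_neuron([1, 1], [1, 1], threshold=2))  # 1
print(mcp_neuron([1, 0], [1, 1], threshold=2))  # 0
```

Because the weights and the threshold are fixed, only the inputs can vary, which is exactly the rigidity noted above.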
Perceptron:
A perceptron is an algorithm used for supervised learning of binary classifiers.
Binary classifiers decide whether an input, usually represented by a series of vectors
belongs to a specific class.
In short, a perceptron is a single-layer neural network.
They consist of four main parts: input values, weights and bias, net sum,
and activation function.
How does a perceptron work?
The process begins by taking all the input values and multiplying them by their weights.
Then, all of these multiplied values are added together to create the weighted sum.
The weighted sum is then applied to the activation function, producing the perceptron’s
output.
The activation function plays the integral role of ensuring the output is mapped between
required values such as (0, 1) or (-1, 1). It is important to note that the weight of an input is
indicative of the strength of a node.
Similarly, an input’s bias value gives the ability to shift the activation function curve up or
down.
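The process above can be sketched as a small Python function (the weights and bias here are hand-picked to realise an OR gate, not learned):

```python
# One perceptron forward pass: weighted sum plus bias, then a step activation.
def perceptron(inputs, weights, bias):
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if net >= 0 else 0

# Hand-picked weights and bias that realise a 2-input OR gate.
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, perceptron([a, b], [1, 1], bias=-0.5))
```

Note how the bias of -0.5 shifts the decision boundary away from the origin, as described above.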
Multilayer Perceptrons:
The perceptron is very useful for classifying datasets that are linearly separable. It
encounters serious limitations with datasets that do not conform to this pattern, as
discovered with the XOR problem.
The XOR problem shows that there exist labellings of four points that are not linearly
separable.
The multi-layer perceptron breaks this restriction and classifies datasets which are not linearly
separable.
They do this by using a more robust and complex architecture to learn regression and
classification models for difficult datasets.
How does a multilayer perceptron work?
The perceptron consists of an input layer and an output layer which are fully connected. MLPs
have the same input and output layers but may have multiple hidden layers between them.
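To see how a hidden layer breaks the XOR restriction, here is a sketch of a two-layer perceptron whose weights were chosen by hand (not learned) so that it computes XOR:

```python
def step(z):
    return 1 if z >= 0 else 0

# A fixed-weight two-layer perceptron that computes XOR.
def xor_mlp(x1, x2):
    h1 = step(x1 + x2 - 0.5)    # hidden unit 1: acts as OR
    h2 = step(x1 + x2 - 1.5)    # hidden unit 2: acts as AND
    return step(h1 - h2 - 0.5)  # output: OR and not AND, i.e. XOR

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_mlp(a, b))  # 0, 1, 1, 0
```

No single perceptron can produce this truth table; the two hidden units carve the plane into a region a single line cannot.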
Feed forward neural network
A feed forward neural network is an artificial neural network in which the connection
between nodes does not form a cycle.
The opposite of a feed forward neural network is a recurrent neural network, in which certain
pathways are cycled.
The feed forward model is the simplest form of neural network as information is only
processed in one direction.
While the data may pass through multiple hidden nodes, it always moves in one direction and
never backwards.
Back propagation
Back propagation is the essence of neural net training.
It is the method of fine-tuning the weights of a neural net based on the error rate obtained in
the previous epoch (i.e., iteration).
Proper tuning of the weights allows you to reduce error rates and to make the model reliable
by increasing its generalization.
Back-Propagation is a short form for “Back propagation of errors”.
It is a standard method of training artificial neural networks.
This method helps to calculate the gradient of a loss function with respect to all the weights
in the network.
How back propagation works
1. Inputs X arrive through the pre-connected path.
2. The input is modelled using real weights W. The weights are usually selected randomly.
3. Calculate the output for every neuron from the input layer, through the hidden layers, to the
output layer.
4. Calculate the error at the output layer:
ERROR = Actual Output – Desired Output
5. Travel back from the output layer to the hidden layers to adjust the weights such that the
error is decreased.
6. Keep repeating the process until the desired output is achieved.
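The loop above can be sketched for a single sigmoid neuron with a squared-error loss (the names and the learning rate are our choices; a real network repeats the update layer by layer):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One training step: forward pass, error, then a gradient-descent weight update.
def train_step(x, target, w, b, lr=0.5):
    y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)  # forward pass
    delta = (y - target) * y * (1 - y)                     # dE/d(net) for E = (y - t)^2 / 2
    w = [wi - lr * delta * xi for wi, xi in zip(w, x)]     # move weights against the gradient
    b = b - lr * delta                                     # adjust the bias the same way
    return w, b, (y - target) ** 2                         # squared error, for monitoring

# Repeating the step drives the error down, as in step 6 above.
w, b = [0.5, 0.5], 0.0
for _ in range(50):
    w, b, err = train_step([1.0, 1.0], 1.0, w, b)
```

Each pass uses the error from the previous iteration to nudge every weight, which is exactly the "fine-tuning" described above.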
Hidden Layer
In neural networks, a hidden layer is located between the input and output of the algorithm;
it applies weights to the inputs and directs them through an activation function as the
output.
In short, the hidden layers perform nonlinear transformations of the inputs entered into the
network.
How does a Hidden Layer work?
Hidden Layers are layers of mathematical functions each designed to produce an output specific
to an intended result.
For example, some forms of hidden layers are known as squashing functions.
These functions are particularly useful when the intended output of the algorithm is a
probability because they take an input and produce an output value between 0 and 1, the range
for defining probability.
Hidden layers allow for the function of a neural network to be broken down into specific
transformations of the data.
Each hidden layer function is specialized to produce a defined output.
Terminologies of artificial neural network
1. Bias: Bias is like an intercept added in a linear equation.
It is an additional parameter in the neural network which is used to adjust the output along
with the weighted sum of inputs to the neuron.
Bias is a constant which helps the model to fit best for the given data.
Output = sum (weight * inputs) + Bias
Example: consider equation y= mx+c here m is acting as weight and constant c is acting as
bias.
Without bias, the model can only fit lines passing through the origin; with the
introduction of bias, the model becomes flexible.
Bias units are simply appended to the input and each hidden layer, and are not
affected by the values in the previous layer.
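The y = mx + c analogy can be made concrete for a one-input neuron (the numbers are illustrative):

```python
# Output = sum(weight * input) + bias, as in the formula above.
def neuron_output(inputs, weights, bias):
    return sum(w * x for w, x in zip(weights, inputs)) + bias

# With bias 0 the line must pass through the origin; bias shifts it up or down.
print(neuron_output([2.0], [1.5], 0.0))  # 3.0  (y = mx)
print(neuron_output([2.0], [1.5], 1.0))  # 4.0  (y = mx + c)
```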
Image Processing
What is Image Processing??
It is a method to perform some operations on an image, in order to get an
enhanced image or to extract some useful information from it.
It is a type of signal processing in which the input is an image and the output may be an
image or features associated with that image.
Why do we need to process the image??
It is motivated by two major applications-
Improvement of pictorial information for human perception.
Image processing for autonomous machine application.
Efficient storage and transmission.
Image: An array or a matrix of multiple pixels which are properly arranged in
columns and rows.
Pixel: Fundamental component of an image. An image is fully composed of pixels.
Types of image
Binary Image: binary is nothing but 0's and 1's.
It contains only 2 colours: one is white and the other is black.
In a binary image, each pixel needs only one bit of storage.
Black and white image:
Each pixel needs 8 bit storage space.
It gives enriched quality of image.
Gray scale image:
It is a special image which has a range of shades from black to white.
The range of these shades varies from 0 to 255, where 0 stands for black, 255 stands
for white, and in between are the different shades.
Colour image: Each pixel in a colour image carries colour information.
Each pixel is composed of 3 channels, most commonly regarded as red,
green, and blue (RGB).
Each of these channels needs 8 bits of storage, hence 24 bits in total for each pixel.
The shade of each of the pixel would vary based on the intensity of R or G or B.
Steps in image processing:
[Figure: steps in image processing — image acquisition, image pre-processing, image
enhancement, image restoration, color image processing, image segmentation, image
representation and description, and object recognition, all supported by a knowledge base.]
Step1. Image acquisition: This is the first and foremost step. Here, we acquire the image.
Most of the time the acquired image will already be digital when it comes from the camera; if
not, it should be converted to a digital image with the help of an ADC.
Step2.Image enhancement: This is one of the most important steps in the entire work
flow.
If this step is done well, the rest of the steps will be good and will meet expectations.
The major focus of this step is to make sure that the image is good enough to be processed
and has all that is needed to enable further processing.
Here, the smoothening, sharpening, increasing/decreasing brightness, adjusting contrast
etc...all are carried out which would eventually facilitate the rest of the steps to come. Image
enhancement methods can be based on either spatial or frequency domain techniques.
Spatial domain enhancement methods
Spatial domain techniques directly deal with image pixels. The pixel values are manipulated
to achieve desired enhancement.
Spatial domain techniques are particularly useful for directly altering the gray level values of
individual pixels.
The operation can be formulated as g(x, y) = T[f(x, y)], where g is the output, f is the input
image and T is an operation on f defined over some neighbourhood of (x, y).
According to the operations on the image pixels, it can be further divided into 2 categories:
point operations and spatial operations (including linear and non-linear operations).
The use of spatial masks for image processing is called spatial filtering.
The masks used are called spatial filters.
The basic approach is to sum products between the mask coefficients and the intensities of the
pixels under the mask at a specific location in the image (2D convolution).
R(x, y) = ∑(i=−d to d) ∑(j=−d to d) w(i, j) · f(x−i, y−j)
where (2d+1) × (2d+1) is the mask size, the w(i, j) are the weights of the mask, f(x, y) is the
input pixel at coordinates (x, y), and R(x, y) is the output value at (x, y).
If the center of the mask is at location (x, y) in the image, the gray level of the pixel located at
(x, y) is replaced by R, the mask is then moved to the next location in the image and the
process is repeated. This continues until all pixel locations have been covered.
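The mask operation R(x, y) described above can be rendered directly in Python (an unoptimised sketch; border pixels, where the mask does not fit, are left as-is, and the function name is ours):

```python
# Slide a (2d+1) x (2d+1) mask over the image and sum the mask-weighted
# pixel intensities at each interior location (2D convolution).
def spatial_filter(image, mask):
    d = len(mask) // 2
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]  # borders copied unchanged
    for x in range(d, h - d):
        for y in range(d, w - d):
            out[x][y] = sum(mask[i + d][j + d] * image[x - i][y - j]
                            for i in range(-d, d + 1) for j in range(-d, d + 1))
    return out

# A 3x3 neighbourhood-averaging (low pass) mask: equal coefficients summing to 1.
avg_mask = [[1 / 9] * 3 for _ in range(3)]
```

Applying `avg_mask` blurs the image, as the smoothing-filter discussion below describes.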
a) Smoothing filter: Smoothing filters are used for blurring and for noise reduction.
Blurring is used in pre-processing steps, such as removal of small details from an image prior
to object extraction, bridging of small gaps in lines or curves.
Noise reduction can be accomplished by blurring with a linear filter and also by non-linear
filtering.
a.1) Low pass filtering: The key requirement is that all coefficients are positive.
Neighbourhood averaging is a special case of LPF where all coefficients are equal.
It blurs edges and other sharp details in the image.
Example:
        1 1 1
1/9 ×   1 1 1
        1 1 1
a.2) Median filtering: If the objective is to achieve noise reduction instead of blurring, this
method should be used.
This method is particularly effective when the noise pattern consists of strong, spike-like
components and the characteristic to be preserved is edge sharpness.
It is a non linear operation.
For each input pixel f(x, y), we sort the values of the pixel and its neighbours to determine
their median and assign its value to output pixel g(x, y).
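A minimal sketch of the procedure just described (3×3 neighbourhood, borders skipped; the naming is ours):

```python
# Replace each interior pixel by the median of its size x size neighbourhood.
def median_filter(image, size=3):
    d = size // 2
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for x in range(d, h - d):
        for y in range(d, w - d):
            window = sorted(image[x + i][y + j]
                            for i in range(-d, d + 1) for j in range(-d, d + 1))
            out[x][y] = window[len(window) // 2]
    return out

# A single bright spike in a flat region is removed entirely,
# while a genuine edge would survive (the median stays on one side of it).
img = [[10] * 3 for _ in range(3)]
img[1][1] = 255
print(median_filter(img)[1][1])  # 10
```

Unlike averaging, the spike never contributes to the output, which is why median filtering suits salt-and-pepper-style noise.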
a.3) Sharpening filters: Used to highlight fine detail in an image, or to enhance detail that has
been blurred, either in error or as a natural effect of a particular method of image acquisition.
Uses of image sharpening vary and include applications ranging from electronic printing and
medical imaging to industrial inspection and autonomous target detection in smart weapons.
(a) Basic highpass spatial filter
Shape of the impulse response needed to implement a high pass spatial filter indicates that the
filter should have positive coefficients near its center, and negative coefficients in the outer
periphery.
Example: filter mask of a 3x3 sharpening filter:
        −1 −1 −1
1/9 ×   −1  8 −1
        −1 −1 −1
The filtered output pixels might have gray levels exceeding [0, L−1].
The results of highpass filtering therefore involve some form of scaling and/or clipping to
make sure that the gray levels of the final results are within [0, L−1].
(b)Derivative filters
Differentiation can be expected to have the opposite effect of averaging, which tends to blur
detail in an image, and thus sharpen an image and be able to detect edges.
The most common method of differentiation in image processing applications is the gradient.
For a function f(x, y), the gradient of f at coordinates (x', y') is defined as the vector
∇f(x', y') = [∂f/∂x, ∂f/∂y] evaluated at (x', y').
Its magnitude can be approximated in a number of ways, which result in a number of
operators such as Roberts, Prewitt and Sobel operators for computing its value.
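As a sketch, the Sobel operator mentioned above approximates the two partial derivatives with 3×3 masks, and the gradient magnitude is commonly approximated by |Gx| + |Gy| (pure Python; (x, y) must be an interior pixel):

```python
# Sobel masks approximating d/dx (horizontal change) and d/dy (vertical change).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

# Approximate |grad f| at (x, y) as |Gx| + |Gy|, a common cheap approximation.
def sobel_magnitude(image, x, y):
    gx = sum(SOBEL_X[i + 1][j + 1] * image[x + i][y + j]
             for i in (-1, 0, 1) for j in (-1, 0, 1))
    gy = sum(SOBEL_Y[i + 1][j + 1] * image[x + i][y + j]
             for i in (-1, 0, 1) for j in (-1, 0, 1))
    return abs(gx) + abs(gy)

# Large response at a vertical edge, zero response in a flat region.
edge = [[0, 0, 255]] * 3
print(sobel_magnitude(edge, 1, 1))  # 1020
```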
Enhancement in frequency domain:
We simply compute the Fourier transform of the image to be enhanced, multiply the result by
a filter transfer function, and take the inverse transform to produce the enhanced image.
Spatial domain: g(x, y)=f(x, y)*h(x, y)
Frequency domain: G(w1,w2)=F(w1,w2)H(w1,w2)
Low pass filtering:
Edges and sharp transitions in the gray levels contribute to the high frequency content of
the Fourier transform, so a low pass filter smoothes an image.
Formula of the ideal LPF:
H(u, v) = 1 if D(u, v) ≤ D0, 0 otherwise
High pass filtering:
A high pass filter attenuates the low frequency components without disturbing the high
frequency information in the Fourier transform domain, and can thereby sharpen edges.
Formula of the ideal HPF:
H(u, v) = 0 if D(u, v) ≤ D0, 1 otherwise
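The pipeline G = F·H can be sketched with NumPy's FFT (assuming NumPy is available; the function name and the ideal-LPF choice are ours, and the input is a 2D NumPy array):

```python
import numpy as np

# Frequency-domain smoothing: transform, multiply by an ideal low pass H,
# then inverse-transform, exactly as G(w1, w2) = F(w1, w2) H(w1, w2) above.
def ideal_lowpass(image, d0):
    F = np.fft.fftshift(np.fft.fft2(image))        # spectrum with DC at the centre
    h, w = image.shape
    u = np.arange(h) - h // 2
    v = np.arange(w) - w // 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)  # distance from the centre
    H = (D <= d0).astype(float)                     # ideal LPF transfer function
    return np.real(np.fft.ifft2(np.fft.ifftshift(H * F)))
```

A constant image passes through unchanged (only the DC component exists, and it lies inside the cutoff); images with sharp edges come out blurred.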
Step3.Image restoration: It helps in improving the appearance of the image.
It is actually a step which could undo the defects in the image (degraded image).
Degradation may also be in the form of noise, blur etc... and restoration helps in restoring the
image with better quality.
Step4.Color Image Processing: In this step, the color information (R, G and B) could
be used to extract and to understand the features from the image.
The features which normally one can think of are color, texture, shape, structure etc... and this
would enable the user to understand the image better for meaningful processing .
Step5.Image segmentation:
It is the process through which a digital image shall be partitioned into multiple segments.
This segmenting shall enable the user to identify the objects and to extract more meaningful
information from the image.
It divides the image into multiple segments and extracts the best out of it!
Segmentation can be done based on texture, greyscale, motion, depth etc.
Segmentation is unsupervised learning.
Image segmentation algorithms can be classified into two classes :
1) Global segmentation algorithms
2) Local segmentation algorithms
In a global segmentation algorithm, we deal with the whole image as one unit. Global
segmentation algorithms are used when the object occupies a large part of the image, so a
single parameter can be used to segment the image. In a local segmentation algorithm, the
objects are small or numerous, so the parameter value has to be varied for different
regions. Image segmentation can be approached from many different philosophical
perspectives. Some of them are given below.
a). Thresholding based image segmentation
Image segmentation based on thresholding is the simplest technique. In this technique we
set a threshold value; pixels lying above (or below) it are classified as object and pixels lying
below (or above) it are classified as background.
This technique converts a gray scale image into binary image.
This technique gives good results if the background and the object have a large variation in
their intensity values.
The disadvantage is that it cannot identify multiple objects; to identify multiple
objects, multiple thresholds are required.
Multiple thresholds can be found using a statistical recursive algorithm, which uses the mean
and variance to segment an image into multiple levels. Multiple thresholds are useful in
dealing with colored images or images with complex backgrounds, where a single-threshold
algorithm cannot work.
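A single-threshold segmentation is one line of logic (the function name is ours):

```python
# Classify each pixel: above the threshold -> object (1), otherwise background (0).
# The output is the binary image the text describes.
def threshold_segment(image, t):
    return [[1 if p > t else 0 for p in row] for row in image]

# A bright object on a dark background separates cleanly.
print(threshold_segment([[10, 200], [30, 220]], t=128))  # [[0, 1], [0, 1]]
```

With intensities that overlap across object and background, no single t works, which motivates the multiple-threshold methods above.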
b).Region based image segmentation:
Region based segmentation can be done in two ways:
1) Region Growing: Region growing is the simplest of the region based image segmentation
techniques.
In this technique, a seed point is chosen at random; then the neighbouring pixels are checked,
against some criterion, to determine whether they should be added to the initial seed points
or not.
It is an iterative process, so we need to specify a stop condition. Generally, the iterative
process runs until there is no change in the seed points in two successive iterations. We use
the 4-connected neighbourhood to grow the region.
Region growing algorithm:
a) Choose the seed points.
b) If the neighbouring pixels of the initial seed points satisfy a criterion such as a threshold,
they are added to the seed points.
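The two-step algorithm above can be sketched with a queue of pixels to visit (4-connected neighbourhood; an intensity-difference threshold `tol` stands in for "the criteria", and the names are ours):

```python
from collections import deque

# Grow a region from `seed`, repeatedly adding 4-connected neighbours whose
# intensity is within `tol` of the seed value; stops when no pixel changes.
def region_grow(image, seed, tol):
    h, w = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if (0 <= nx < h and 0 <= ny < w and (nx, ny) not in region
                    and abs(image[nx][ny] - seed_val) <= tol):
                region.add((nx, ny))
                queue.append((nx, ny))
    return region

# The dark corner of this image grows into a 3-pixel region.
img = [[10, 10, 200], [10, 200, 200], [200, 200, 200]]
print(region_grow(img, (0, 0), tol=20))
```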
2) Data clustering: Data clustering methods initially assume the whole image to be a single
cluster and then use mathematics and statistics to create a number of clusters within the image.
To reduce computational complexity and running time, centroid approach is used. Two types
of clustering are possible: hierarchical clustering and partitional clustering.
In hierarchical clustering, the number of clusters can be changed during the process.
In partitional clustering, however, the number of clusters must be decided before
processing.
Edge based image segmentation:
Edge based segmentation techniques first find the edges using various operators.
Since an object can be represented by its edges, we can segment the image by simply
finding the edges in it.
A typical approach to segmentation using edges is
1) compute an edge image from original image,
2) process the edge image for broken edges,
3) transform the result to an ordinary segmented image by filling in the object boundaries.
The first and third steps are simple; the problem lies in the second step: transforming an edge
(or edgeness) image to closed boundaries often requires the removal of edges that are
caused by noise or other artifacts, the bridging of gaps at locations where no edge was
detected (but there should logically be one), and intelligent decisions to connect those edge
parts that make up a single object.
The watershed segmentation technique is one technique which can be used to process the
edge image. Commonly used segmentation techniques include Otsu's method, the K-means
algorithm, the quad-tree approach, the Delta-E algorithm, and the FTH method.
What is segmentation useful for?
After successfully segmenting the image, the contours of objects can be extracted using edge
detection and border techniques.
Shape of objects can be described.
Based on shape, texture and color objects can be identified.
Step6.Image representation and description:
The raw result of segmentation cannot simply be retained to arrive at meaningful results.
Image representation is concerned with transforming the raw data arrived at after
segmentation to some suitable form (feature vectors) for further processing.
This can be achieved by two means.
Boundary representation: Focus is on the external shape. Includes edges, corners, etc.
Regional representation: Focus is on the internal properties. It includes texture, shape,
etc.
Image description is all about feature selection. It helps in differentiation of object A from
object B in the image.
Step7.Object recognition:
Recognition is nothing but identification of someone or something or person from
previous encounters or knowledge.
Recognition helps in recognizing what is what in the image.
For instance, in an image which has a car and a bike, the car will be recognized as a car
and the bike as a bike.
This is possible through the features present in the image and nothing else.
Step8. Knowledge base
Knowledge base helps in directing our focus to the region which has the information
instead of searching all over.
It could be searching for defects in textiles, or segmenting objects from a
satellite image.
Hence, the knowledge base depends entirely on what you want to do with the image.
Image noise
It is defined as a variation or deviation of brightness or color information in the image.
The source of image noise is mostly the camera sensor and its associated internal
electronic components.
These components in the cameras introduce the imperfections in the image which certainly
degrades the quality of the image.
Where does image noise come from??
Insufficient lighting
Environmental conditions
Sensor temperature
Transmission channels and
Dust factors
Types of image noise
Noise can be classified as:
Photo-electronic noise: photon noise and thermal noise
Impulse noise: salt and pepper noise
Structured noise
Photo-electronic noise:
The photo electronic noise is classified as photon noise and thermal noise.
Photon noise
Photon noise is also called shot noise or Poisson noise.
This is noise connected to the uncertainty associated with the measurement of light.
Photon noise arises when the number of photons sensed by the camera's sensor is not
sufficient to get meaningful information from the scene.
This noise occurs mostly in poor or low lighting conditions.
Thermal noise
Thermal noise is one of the most frequently appearing noises in any electronic circuit.
Thermal noise is also called Johnson-Nyquist noise.
It is produced by random motion of the charged carriers in any conducting medium.
One can observe thermal noise in almost all electronic circuits. Thermal noise is also regarded as
white noise.
It is referred as white noise as it impacts all the frequency components of the signal equally.
Thermal noise naturally increases with temperature.
Impulse noise
Impulse noise is classified as salt and pepper noise.
Impulse noise mainly arises due to transmission errors in the signal.
This can also be caused due to malfunctioning of the pixels in the sensors of the camera or
memory location faults in the storage.
It is also called spike noise or independent noise.
It has got a tendency to change or modify the pixel values independently thereby creating the
damage.
Salt and pepper noise
In this type of noise the image gets dark pixels in bright regions and bright pixels
in dark regions.
The main source of this kind of noise is analog-to-digital converter
errors.
Bit transmission errors also cause salt and pepper noise.
As a result, the image has a lot of black and white spots. A noisy pixel takes either the salt
value or the pepper value.
The salt value is grey level 255 (brightest) and the pepper value is grey level 0 (darkest).
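For experimentation, salt and pepper noise is easy to simulate (the function and the even split between salt and pepper are our choices):

```python
import random

# Flip each pixel to pepper (0) or salt (255) with total probability `prob`,
# split evenly between the two; all other pixels are left untouched.
def add_salt_pepper(image, prob, salt=255, pepper=0):
    out = [row[:] for row in image]
    for x in range(len(out)):
        for y in range(len(out[0])):
            r = random.random()
            if r < prob / 2:
                out[x][y] = pepper
            elif r < prob:
                out[x][y] = salt
    return out
```

Applying a median filter to the noisy output removes most of the spikes, which is why it is the standard remedy for this noise type.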
Structured noise
Structured noise can be periodic-stationary or periodic non-stationary in nature.
Periodic non-stationary noise:
In this case, the noise parameters (amplitude, frequency and phase) vary
across the image. This is caused by interference between the electronic components.
Periodic-stationary noise:
Here the noise parameters like amplitude, frequency and phase are fixed unlike the non-
stationary noise.
When an image gets affected by periodic noise, it appears like repeating pattern added on top
of the original image. Notch filters are used to minimise the impact of the periodic noise.
How to overcome image noise
Photon noise depends entirely on the number of photons: the larger the number of photons
collected, the lesser the noise.
Thermal noise can be reduced with careful reduction of temperature of the operation. Also, it
gets reduced with the reduction of the resistor values in the circuit.
Impulse noise can be overcome with filters like mean filters and median filters.
Applications of image processing in various fields
Agriculture: For sorting fruits, grading the fruit quality, disease identification
in the crop, weed identification.
Automobiles: Lane detection, number plate detection, toll collection system.
Industry: Fault detection, color identification of product, inspection
application.
Medicine: For diagnostic purposes, x ray or scan images can be read and
understood to see the variations and symptoms for many diseases, cancer cells
detection.
Defence: Target detection, missile guidance, surveillance (e.g., floods),
navigation.