Final Report
INTRODUCTION
Today, there is almost no area of technical endeavor that is not impacted in some
way by digital image processing. Medical image processing has experienced
dramatic expansion and has become an interdisciplinary research field attracting
expertise from applied mathematics, computer science, engineering, statistics,
physics, biology, and medicine. Computer-aided diagnostic processing has already
become an important part of clinical routine. Accompanied by the rapid
development of high technology and the use of various imaging modalities, more
challenges arise, for example how to process and analyze a significant volume of
images so that high-quality information can be produced for disease diagnosis
and treatment.
Plant diseases affect the growth of their respective species, therefore their early
identification is very important. Many Machine Learning (ML) models have been
employed for the detection and classification of plant diseases but, after the
advancements in a subset of ML, that is, Deep Learning (DL), this area of research
appears to have great potential in terms of increased accuracy. Many
developed/modified DL architectures are implemented along with several
visualization techniques to detect and classify the symptoms of plant diseases.
Moreover, several performance metrics are used for the evaluation of these
architectures/techniques. This review provides a comprehensive explanation of DL
models used to visualize various plant diseases. In addition, some research gaps are
identified from which greater transparency can be obtained for detecting diseases in
plants, even before their symptoms appear clearly.
1.1.1 CHARACTERISTICS
1.1.2 Types
The two types of methods used for image processing are analog and digital image
processing. Analog, or visual, techniques of image processing can be used for
hard copies such as printouts and photographs. Image analysts use various
fundamentals of interpretation while using these visual techniques. Image
processing is not confined only to the area that has to be studied but also depends
on the knowledge of the analyst. Association is another important tool in image
processing through visual techniques. Analysts therefore apply a combination of
personal knowledge and collateral data to image processing.
1.1.3 Phases
There are three general phases that all types of data have to undergo while using
the digital technique: pre-processing, enhancement, and display. Selection of
training data is included in these phases.
1.1.4 Steps
1. Importing the image via image acquisition tools.
2. Analysing and manipulating the image, which includes data compression, image
enhancement, and spotting patterns that are not visible to human eyes, as in
satellite photographs.
3. Output is the last stage, in which the result can be an altered image or a report
that is based on the image analysis.
1.2. Phytopathology
Tomatoes are produced commercially both in the field and in enclosed structures
(high tunnels and greenhouses). They are also a very popular garden plant among
homeowners. Tomato production, whether for commercial or personal use, is not
always an easy task. A variety of disorders, insects, diseases and pests may cause
problems during any given growing season and may damage a crop, leading to
reduced or poor-quality yields.
There are several pathogens that can cause tomato plant disease. Some tomato
disease pathogens are fungal organisms while others are bacterial or even viral.
1.2.1 Types
Bacterial Spot
The bacteria survive the winter on volunteer tomato plants and on infected plant
debris. Moist weather is conducive to disease development. Most outbreaks of the
disease can be traced back to heavy rainstorms that occurred in the area. Infection
of leaves occurs through natural openings. Infection of fruits must occur through
insect punctures or other mechanical injury.
Bacterial spot is difficult to control once it appears in the field. Any water
movement from one leaf or plant to another, such as splashing rain drops, overhead
irrigation, and touching or handling wet plants, may spread the bacteria from
diseased to healthy plants.
Prevention & Treatment: Only use certified disease-free seed and plants. Avoid
areas that were planted with peppers or tomatoes during the previous year. Avoid
overhead watering by using drip or furrow irrigation. Remove all diseased plant
material. Prune plants to promote air circulation. Spraying with a copper fungicide
will give fairly good control of the bacterial disease. Follow the instructions on the
label. See Table 1 for fungicide products for home garden use.
Early Blight
This disease is caused by the fungi Alternaria linariae and A. solani and is first
observed on the plants as small, brown lesions mostly on the older foliage. Spots
enlarge and concentric rings in a bull’s-eye pattern may be seen in the center of the
diseased area. Tissue surrounding the spots may turn yellow. If high temperature
and humidity occur at this time, much of the foliage is killed. Lesions on the stems
are similar to those on leaves and sometimes girdle the plant if they occur near the
soil line (collar rot). On the fruits, lesions attain considerable size, usually
involving nearly the entire fruit. Concentric rings are also present on the fruit.
Infected fruit frequently drops.
The fungus survives on infected debris in the soil, on seed, on volunteer tomato
plants and other solanaceous hosts, such as Irish potato, eggplant, and black
nightshade.
To reduce disease severity, test the garden soil annually and maintain a sufficient
level of potassium. Side dress tomato plants monthly with calcium nitrate for
adequate growth.
If disease is severe enough to warrant chemical control, select one of the following
fungicides: mancozeb (very good); chlorothalonil or copper fungicides (good).
Follow the directions on the label. See Table 1 for examples of fungicide products
for home garden use.
Late Blight
Late blight is a potentially serious disease of potato and tomato, caused by the
fungus Phytophthora infestans. Late blight is especially damaging during cool, wet
weather. The fungus can affect all plant parts. Young leaf lesions are small and
appear as dark, water-soaked spots. These leaf spots will quickly enlarge and a
white mold will appear at the margins of the affected area on the lower surface of
leaves. Complete defoliation (browning and shriveling of leaves and stems) can
occur within 14 days from the first symptoms. Infected tomato fruits develop
shiny, dark or olive-colored lesions, which may cover large areas. Fungal spores
are spread between plants and gardens by rain and wind. A combination of daytime
temperatures in the upper 70s °F with high humidity is ideal for infection.
• Allow extra room between the plants, and avoid overhead watering, especially
late in the day.
• Destroy volunteer tomato and potato plants and nightshade family weeds, which
may harbor the fungus.
• Do not compost rotten, store-bought potatoes.
• Plant resistant cultivars. See Table 3 for tomato cultivars with resistance to late
blight.
Leaf Mold
The fungus Fulvia fulva causes leaf mold. It is first observed on older leaves near
the soil where air movement is poor and humidity is high. The initial symptoms are
pale green or yellowish spots on the upper leaf surface, which enlarge and turn a
distinctive yellow.
Under humid conditions the spots on the lower leaf surfaces become covered with
a gray, velvety growth of the spores produced by the fungus. When infection is
severe, the spots coalesce, and the foliage is killed. Occasionally, the fungus
attacks stems, blossoms and fruits. Green and mature fruit can have a black,
leathery rot on the stem end.
The fungus survives on crop residue and in the soil. Spores are spread by rain,
wind or tools. Seeds can be contaminated. The fungus is dependent on high relative
humidity and high temperature for disease development.
Prevention & Treatment: Crop residue should be removed from the field.
Staking and pruning to increase air circulation helps to control the disease. Avoid
wetting leaves when watering. Rotate with vegetables other than tomatoes. Using a
preventative fungicide program with chlorothalonil, mancozeb or copper fungicide,
can control the disease.
Spider Mites
Leaves are stippled with yellow and may appear bronzed, with webbing covering
the leaves; the mites may be visible as tiny moving dots on the webs or on the
underside of leaves, best viewed using a hand lens. The infestation is usually not
spotted until there are visible symptoms on the plant, when leaves turn yellow and
may drop from the plant. Spider mites, which are arachnids, are the main cause of
this damage. Spider mites thrive in dusty conditions, and water-stressed plants are
more susceptible to attack.
Prevention and Treatment
In the home garden, spraying plants with a strong jet of water can help reduce
buildup of spider mite populations; if mites become problematic apply insecticidal
soap to plants; certain chemical insecticides may actually increase mite populations
by killing off natural enemies and promoting mite reproduction.
The fungus infects all parts of the plant. Infected leaves show small, pinpoint,
water-soaked spots initially. As the disease progresses, the spots enlarge to become
necrotic lesions with conspicuous concentric circles, dark margins, and light brown
centers. The fruits exhibit brown, slightly sunken flecks in the beginning, but later
the lesions develop a large, pitted appearance.
The pathogen also infects cucumber, pawpaw, ornamental plants, and some weed
species. Damaged fruits are more susceptible to this disease.
Prevention and Treatment:
Remove the plant debris and burn them. Avoid over application of nitrogen
fertilizer. If the disease is severe spray suitable fungicides.
Tomato Mosaic Virus
Symptoms can occur at any growth stage and any part of the plant can be affected.
Infected leaves generally exhibit a dark green mottling or mosaic; some strains of
the virus can cause yellow mottling on the leaves. Young leaves may be stunted or
distorted, and severely infected leaves may have raised green areas. Fruit yields are
reduced in infected plants; green fruit may have yellow blotches or necrotic spots,
and dark necrotic streaks may appear on the stems, petioles, leaves, and fruit.
Tomato mosaic virus (ToMV) is a closely related strain of Tobacco mosaic virus
(TMV). It enters fields via infected weeds, peppers, or potato plants; the virus may
also be transmitted to tomato fields by grasshoppers, small mammals, and birds.
Tomato Yellow Leaf Curl Virus
Tomato yellow leaf curl virus (TYLCV) is not seed-borne, but is transmitted by
whiteflies. This disease is extremely damaging to fruit yield in both tomato and
pepper crops. Whiteflies may bring the disease into the garden from infected weeds
nearby, such as various nightshades and jimsonweed. After infection, tomato plants
may be symptomless for as long as 2 to 3 weeks.
Symptoms in tomato plants are the upward curling of leaves, yellow (chlorotic)
leaf margins, smaller leaves than normal, plant stunting, and flower drop. If tomato
plants are infected early in their growth, there may be no fruit formed. Infected
plants may appear randomly throughout the garden. Pepper plants may also
become infected, but will show no symptoms.
Prevention & Treatment: Removal of plants with initial symptoms may slow the
spread of the disease. Rogued (pulled out) infected plants should be immediately
bagged to prevent the spread of the whiteflies feeding on those plants. Keep weeds
controlled within and around the garden site, as these may be alternate hosts for
whiteflies. Reflective mulches (aluminum or silver-colored) can be used in the
rows to reduce whitefly feeding.
Low concentration sprays of a horticultural oil or canola oil will act as a whitefly
repellent, reduce feeding and possibly transmission of the virus. Use a 0.25 to 0.5%
oil spray (2 to 4 teaspoons horticultural or canola oil & a few drops of dish soap
per gallon of water) weekly. Examples of products containing horticultural oil are
Ferti-lome Horticultural Oil Spray and Bonide All Seasons Spray Oil. An example
of a product containing canola oil is Espoma Earth-tone Horticultural Oil Ready to
Spray.
At the end of the season, remove all susceptible plants and burn or dispose of them.
See Table 6 for tomato cultivars with resistance to Tomato yellow leaf curl virus.
1.4 OBJECTIVE
2 LITERATURE SURVEY
Food Image Recognition Based on Densely Connected Convolutional Neural Networks
(2020)
Convolutional neural networks have been widely used for image recognition as they are capable
of extracting features with high accuracy. In this paper, we propose a DenseFood model based
on densely connected convolutional neural network architecture, which consists of multiple
layers. A combination of softmax loss and center loss is used during the training process to
minimize the variation within the same category and maximize the variation across different
ones. For performance comparison, three models, namely, DenseFood, DenseNet121, and
ResNet50 are trained using the VIREO-172 dataset. In addition, we fine-tune pre-trained
DenseNet121 and ResNet50 models to extract features from the dataset. Experimental results
show that the proposed DenseFood model achieves an accuracy of 81.23% and outperforms the
other models in comparison.
A Facial Pore Aided Detection System using CNN Deep Learning Algorithm (2018)
Many people are concerned about their facial skin maintenance. Rough pores are one of the facial
skin problems that annoy many people. Facial pores are tiny and have various shapes, so it is
difficult to recognize them using traditional image processing. In this paper, we propose an
approach based on convolutional neural networks (CNNs) to develop a facial pore aided
detection system. We use the LeNet-5 model as our benchmark architecture and investigate the
performance of networks of different depths on our facial pore dataset. The facial pore aided
detection system will help people understand more about their facial skin problems and properly
keep their facial skin well.
3.1 EXISTING SYSTEM
The existing system is based on the simple linear iterative clustering (SLIC) segmentation
method to detect disease in plant leaves. It also uses visual attributes such as color, gradient,
texture, and shape to describe the features of the leaves.
3.1.1 Disadvantages
3.2 PROPOSED SYSTEM
• Machine learning is applied to detect diseases in plant leaves, as it analyzes the data from
different aspects and classifies it into one of a predefined set of classes.
• The morphological features and properties like color, intensity and dimensions of the plant
leaves are taken into consideration for classification.
• It presents an overview on various types of plant diseases and different classification
techniques in machine learning that are used for identifying leaf diseases.
• It focuses on identifying tomato plant leaf diseases with a machine learning approach and
does not use CNN as the classifier, so the accuracy in prediction of plant leaf disease is
average.
3.2.1 Advantages
It is fast to process new input data.
Features are automatically deduced and optimally tuned for the desired outcome.
It provides more accuracy when compared to PNN.
CHAPTER 4
SYSTEM REQUIREMENTS
4.1 REQUIREMENTS
The software requirements specification is the most important document in the software
development process. User requirements are expressed in natural language.
Anaconda Navigator (Jupyter Notebook)
Windows XP/7/8
4.1.1 Anaconda Navigator
Anaconda Navigator is a desktop graphical user interface (GUI) included in Anaconda
distribution that allows you to launch applications and easily manage conda packages,
environments, and channels without using command-line commands. Navigator can search for
packages on Anaconda Cloud or in a local Anaconda Repository. It is available for Windows,
macOS, and Linux.
Many scientific packages depend on specific versions of other packages in order to run.
Data scientists often use multiple versions of many packages and use multiple environments to
separate these different versions. The command-line program conda is both a package manager
and an environment manager. This helps data scientists ensure that each version of each package
has all the dependencies it requires and works correctly. Navigator is an easy, point-and-click
way to work with packages and environments without needing to type conda commands in a
terminal window.
4.1.2 Applications
The following applications are available by default in Navigator:
JupyterLab
Jupyter Notebook
Spyder
PyCharm
VSCode
Glueviz
Orange 3 App
RStudio
Anaconda Prompt (Windows only)
Anaconda PowerShell (Windows only)
4.1.3 CONDA
CONDA as a package manager helps you find and install packages. If you need a package
that requires a different version of Python, you do not need to switch to a different environment
manager, because CONDA is also an environment manager. With just a few commands, you can
set up a totally separate environment to run that different version of Python, while continuing to
run your usual version of Python in your normal environment.
4.1.4 Anaconda Cloud
Anaconda Cloud is a package management service that lets you host and share packages,
notebooks, and environments. You can also use Jupyter Notebooks the same way. Jupyter
Notebooks are an increasingly
popular system that combines your code, descriptive text, output, images, and interactive
interfaces into a single notebook file that is edited, viewed, and used in a web browser.
4.4 Keras
Keras is a high-level neural networks API, written in Python and capable of running on top
of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast
experimentation. Being able to go from idea to result with the least possible delay is the key to
performing good research.
Keras is an API designed for human beings, not machines. It offers consistent & simple
APIs, it minimizes the number of user actions required for common use cases, and it provides
clear and actionable feedback upon user error.
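As a brief illustration of the Keras API described above, the following sketch builds and
compiles a small sequential network (the layer sizes, input dimension, and binary classification
setup are illustrative assumptions, not part of this project's model):

from keras.models import Sequential
from keras.layers import Dense

# Define a small fully connected network for a binary classification task.
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(100,)))  # hidden layer
model.add(Dense(1, activation='sigmoid'))                    # output layer

# Configure the learning process with a loss function, optimizer, and metric.
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()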
4.5 TensorFlow
TensorFlow is free and opensource softwarelibrary for dataflow and differentiable programming
across a range of tasks. It is a symbolic math library, and is also used for machine
learning applications such as neural networks.[4] It is used for both research and production
at Google.
TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive,
flexible ecosystem of tools, libraries and community resources that lets researchers push the
state-of-the-art in ML and developers easily build and deploy ML powered applications.
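As a small sketch of the differentiable programming TensorFlow provides (assuming
TensorFlow 2.x is installed; the function y = x * x is an arbitrary example), a gradient can be
computed automatically with tf.GradientTape:

import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x * x            # y = x^2

# dy/dx evaluated at x = 3.0, which is 2 * 3.0 = 6.0
dy_dx = tape.gradient(y, x)
print(dy_dx.numpy())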
4.6 Quitting the Anaconda Navigator
To end your session:
Open Navigator.
Select the default environment.
Click 'anaconda' for updating.
Click 'apply'.
Click 'apply' to accept.
On the Navigator menu, click File > Quit.
Note that quitting Navigator does not, for example, execute any functions to save your
work.
4.7.1 Hardware
RAM : 4.00 GB
4.7.2 Software
Anaconda Navigator (Jupyter Notebook)
Windows XP/7/8
CHAPTER 5
5.1 Architecture diagram
The proposed method consists of a CNN layer and a pooling layer. Collecting the datasets
is one of the most tedious processes, as the dataset must be collected from an authentic and
reliable source. The dataset collection process involves gathering datasets from various
authentic sources and splitting them into train and test image sets. The next step is data
preprocessing, a set of techniques used to transform raw, unformatted data into the desired
format. Preprocessing involves scaling the images and reshaping them to a specific width and
height, as well as splitting the dataset into training and testing sets. The training and testing
datasets are used to train and test the model. Some images may have higher pixel contrast and
some may have lower pixel contrast; high-range images tend to create a stronger loss while
low-range images create a weaker loss, and the sum of them contributes to the back-propagation
update. Scaling the images to the same range [0, 1] therefore allows all images to contribute
evenly to the total loss. The description of CNNs so far suggests a simple architecture
corresponding to a single stack of several convolutional layers. This configuration is the most
commonly implemented architecture in the computer vision literature; however, one could
imagine other architectures that might be more appropriate for the task at hand.
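A minimal sketch of this preprocessing step, assuming images are read with OpenCV and that
150 x 150 is the chosen target size (an assumption that matches the sample code in the
appendix), might look like this:

import cv2
import numpy as np

def preprocess(path, size=(150, 150)):
    # Read the raw image, resize it to a fixed width and height,
    # and rescale pixel values from [0, 255] to [0, 1].
    img = cv2.imread(path)
    img = cv2.resize(img, size)
    return img.astype(np.float32) / 255.0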
The next step is the implementation of the CNN, a deep learning algorithm that takes an image
as input, assigns weights and biases to various aspects of the image, and differentiates the
images from one another. The architecture of a CNN is comparable to many aspects of the
human brain. A convolutional neural network starts with an input image, which is broken into
pixels. The pixels are interpreted as a 2D array (for example, 2 x 2 blocks of pixels); windows of
pixels are taken into account and maximum pooling is performed over each window. The
pooling layer is added after the CNN layer.
The CNN model consists of a single input and a single output node, with many hidden
layers in between, each consisting of many neurons. Each neuron of a hidden layer is connected
to all the neurons of the next hidden layer. Activation functions are added to the neural network
in order to introduce non-linear properties into it. Activation functions are necessary for the
non-linear, complex functional mappings between the inputs and the response variables.
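As a small illustration of the non-linearity that these activation functions introduce (NumPy
only; the sample input values are arbitrary), the ReLU, tanh, and sigmoid functions used in this
project can be written and evaluated directly:

import numpy as np

def relu(x):
    return np.maximum(0.0, x)        # clips negative values to zero

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # squashes values into (0, 1)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))      # [0.  0.  0.  0.5 2. ]
print(np.tanh(x))   # values in (-1, 1)
print(sigmoid(x))   # values in (0, 1)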
As training the neural network is an iterative process, the loss function has to be minimized
in order to increase the accuracy of the model. Optimizers are used to update the weight
parameters so as to minimize the loss function. The CNN model is trained by varying the
activation functions and optimizers, and it is trained over many epochs on the image data.
Large-scale automatic speech recognition is the first and most convincing successful case
of deep learning. LSTM RNNs can learn "Very Deep Learning" tasks that involve multi-second
intervals containing speech events separated by thousands of discrete time steps, where one time
step corresponds to about 10 ms. LSTM with forget gates is competitive with traditional speech
recognizers on certain tasks.
The initial success in speech recognition was based on small-scale recognition tasks based
on TIMIT. The data set contains 630 speakers from eight major dialects of American English,
where each speaker reads 10 sentences. Its small size lets many configurations be tried. More
importantly, the TIMIT task concerns phone-sequence recognition, which, unlike word-sequence
recognition, allows weak language models (without a strong grammar). This lets the weaknesses
in acoustic modeling aspects of speech recognition be more easily analyzed. The error rates
listed below, including these early results and measured as percent phone error rates (PER), have
been summarized over the past 20 years.
5.3 TECHOPEDIA EXPLAINS DEEP NEURAL NETWORK
A neural network, in general, is a technology built to simulate the activity of the human
brain – specifically, pattern recognition and the passage of input through various layers of
simulated neural connections.
Many experts define deep neural networks as networks that have an input layer, an output
layer and at least one hidden layer in between. Each layer performs specific types of sorting and
ordering in a process that some refer to as “feature hierarchy.” One of the key uses of these
sophisticated neural networks is dealing with unlabeled or unstructured data. The phrase “deep
learning” is also used to describe these deep neural networks, as deep learning represents a
specific form of machine learning where technologies using aspects of artificial intelligence seek
to classify and order information in ways that go beyond simple input/output protocols.
Neural networks are a set of algorithms, modeled loosely after the human brain, that are
designed to recognize patterns. They interpret sensory data through a kind of machine
perception, labeling or clustering raw input. The patterns they recognize are numerical,
contained in vectors, into which all real-world data, be it images, sound, text or time series, must
be translated. Neural networks help us cluster and classify. You can think of them as a clustering
and classification layer on top of data you store and manage. They help to group unlabeled data
according to similarities among the example inputs, and they classify data when they have a
labeled dataset to train on. (To be more precise, neural networks extract features that are fed to
other algorithms for clustering and classification; so you can think of deep neural networks as
components of larger machine-learning applications involving algorithms for reinforcement
learning, classification and regression.)
A neural network is a system of hardware and/or software patterned after the operation of
neurons in the human brain. The following are varieties of deep learning technologies:
Feedforward
Regulatory feedback
Radial basis function (RBF)
Recurrent neural network
Modular
Physical
Convolutional
5.4.1 Feedforward
The feedforward neural network was the first and simplest type. In this network the
information moves only from the input layer directly through any hidden layers to the output
layer without cycles/loops. Feedforward networks can be constructed with various types of units,
such as binary McCulloch-Pitts neurons, the simplest of which is the perceptron. Continuous
neurons, frequently with sigmoidal activation, are used in the context of backpropagation.
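A minimal sketch of such a feedforward pass, using NumPy only (the layer sizes, random
weights, and sigmoid activation are illustrative assumptions rather than a specific published
network):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))            # input vector
W1 = rng.normal(size=(5, 4))         # input -> hidden weights
W2 = rng.normal(size=(1, 5))         # hidden -> output weights

# Information flows forward only: input -> hidden -> output, with no cycles.
hidden = sigmoid(W1 @ x)
output = sigmoid(W2 @ hidden)
print(output)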
The architecture of a ConvNet is analogous to that of the connectivity pattern of Neurons in the
Human Brain and was inspired by the organization of the Visual Cortex. Individual neurons
respond to stimuli only in a restricted region of the visual field known as the Receptive Field. A
collection of such fields overlap to cover the entire visual area.
In cases of extremely basic binary images, simply flattening the image and feeding it to a fully
connected network might show an average precision score while performing prediction of
classes, but it would have little to no accuracy when it comes to complex images having pixel
dependencies throughout.
Traditional machine learning relies on shallow nets, composed of one input and one
output layer, and at most one hidden layer in between. More than three layers (including input
and output) qualifies as “deep” learning. So deep is a strictly defined, technical term that means
more than one hidden layer.
In deep-learning networks, each layer of nodes trains on a distinct set of features based on
the previous layer’s output. The further you advance into the neural net, the more complex the
features your nodes can recognize, since they aggregate and recombine features from the
previous layer.
Above all, these nets are capable of discovering latent structures within unlabeled,
unstructured data, which is the vast majority of data in the world. Another word for
unstructured data is raw media; i.e. pictures, texts, video and audio recordings. Therefore, one of
the problems deep learning solves best is in processing and clustering the world’s raw, unlabeled
media, discerning similarities and anomalies in data that no human has organized in a relational
database or ever put a name to.
For example, deep learning can take a million images, and cluster them according to their
similarities: cats in one corner, ice breakers in another, and in a third all the photos of your
grandmother. This is the basis of so-called smart photo albums.
Now apply that same idea to other data types: Deep learning might cluster raw text such
as emails or news articles. Emails full of angry complaints might cluster in one corner of the
vector space, while satisfied customers, or spambot messages, might cluster in others. This is the
basis of various messaging filters, and can be used in customer-relationship management
(CRM). The same applies to voice messages. With time series, data might cluster around
normal/healthy behavior and anomalous/dangerous behavior. If the time series data is being
generated by a smart phone, it will provide insight into users’ health and habits; if it is being
generated by an auto part, it might be used to prevent catastrophic breakdowns.
When training on unlabeled data, each node layer in a deep network learns features
automatically by repeatedly trying to reconstruct the input from which it draws its samples,
attempting to minimize the difference between the network’s guesses and the probability
distribution of the input data itself. Restricted Boltzmann machines, for example, create so-
called reconstructions in this manner.
In the process, these networks learn to recognize correlations between certain relevant
features and optimal results – they draw connections between feature signals and what those
features represent, whether it is a full reconstruction, or with labeled data.
A deep-learning network trained on labeled data can then be applied to unstructured data,
giving it access to much more input than machine-learning nets. This is a recipe for higher
performance: the more data a net can train on, the more accurate it is likely to be. (Bad
algorithms trained on lots of data can outperform good algorithms trained on very little.) Deep
learning's ability to process and learn from huge quantities of unlabeled data gives it a distinct
advantage over previous algorithms. Figure 5.3 illustrates how successive model layers learn
deeper intermediate representations.
A deep fully-connected neural network with an i.i.d. prior over its parameters is
equivalent to a Gaussian process (GP) in the limit of infinite network width. This
correspondence enables exact Bayesian inference for neural networks on regression tasks by
means of straightforward matrix computations. For single hidden-layer networks, the covariance
function of this GP has long been known. Recently, kernel functions for multi-layer random
neural networks have been developed, but only outside of a Bayesian framework.
As such, previous work has not identified the correspondence between using these
kernels as the covariance function for a GP and performing fully Bayesian prediction with a
deep neural network. In this work, we derive this correspondence and develop a computationally
efficient pipeline to compute the covariance functions. We then use the resulting GP to perform
Bayesian inference for deep neural networks on MNIST (Modified National Institute of Standard
and Technology Database) and CIFAR-10. We find that the GP-based predictions are
competitive and can outperform neural networks trained with stochastic gradient descent. We
observe that the trained neural network accuracy approaches that of the corresponding GP-based
computation with increasing layer width, and that the GP uncertainty is strongly correlated with
prediction error. We connect our observations to the recent development of signal propagation in
random neural networks.
5.6 Pooling
Similar to the Convolutional Layer, the Pooling layer is responsible for reducing the
spatial size of the Convolved Feature. This is to decrease the computational power required to
process the data through dimensionality reduction. Furthermore, it is useful for extracting
dominant features which are rotational and positional invariant, thus maintaining the process of
effectively training the model.
There are two types of Pooling: Max Pooling and Average Pooling. Max
Pooling returns the maximum value from the portion of the image covered by the Kernel. On the
other hand, Average Pooling returns the average of all the values from the portion of the image
covered by the Kernel.
Max Pooling also performs as a Noise Suppressant. It discards the noisy activations
altogether and also performs de-noising along with dimensionality reduction. On the other hand,
Average Pooling simply performs dimensionality reduction as a noise suppressing mechanism.
Hence, we can say that Max Pooling performs a lot better than Average Pooling.
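A small sketch contrasting the two pooling operations (NumPy only; the 4 x 4 input and 2 x 2
window are arbitrary illustrative choices):

import numpy as np

x = np.array([[1, 3, 2, 1],
              [4, 6, 5, 2],
              [7, 2, 9, 1],
              [3, 1, 4, 8]], dtype=float)

def pool(img, size=2, mode='max'):
    h, w = img.shape
    out = np.zeros((h // size, w // size))
    for i in range(0, h, size):
        for j in range(0, w, size):
            window = img[i:i+size, j:j+size]
            # Max pooling keeps the strongest activation; average pooling smooths it.
            out[i // size, j // size] = window.max() if mode == 'max' else window.mean()
    return out

print(pool(x, mode='max'))      # [[6. 5.] [7. 9.]]
print(pool(x, mode='average'))  # [[3.5 2.5] [3.25 5.5]]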
The Convolutional Layer and the Pooling Layer, together form the i-th layer of a Convolutional
Neural Network. Depending on the complexities in the images, the number of such layers may be
increased for capturing low-level details even further, but at the cost of more computational
power.
CHAPTER 6
MODULES DESCRIPTION
6.1 MODULES
A module is a separate unit of software or hardware. Typical characteristics of modular
components include portability, which allows them to be used in a variety of systems, and
interoperability, which allows them to function with the components of other systems. The
modules used in detecting plant leaf disease are:
Preprocessing
Filtering
Histogram Equalization
Anisotropic diffusion
Random Walk Segmentation
Feature extraction
Classification
Post processing
6.1.1 Preprocessing
In signal processing, it is often desirable to be able to perform some kind of noise
reduction on an image or signal. The median filter is a nonlinear digital filtering technique, often
used to remove noise. Such noise reduction is a typical pre-processing step to improve the results
of later processing (for example, edge detection on an image). Median filtering is very widely
used in digital image processing because, under certain conditions, it preserves edges while
removing noise.
6.1.2 Filtering
Median filtering follows this basic prescription. The median filter is normally used to reduce
noise in an image, somewhat like the mean filter. Fig 6. 1 illustrates an example of median
filtering
Fig 6.1 Example of median filter
Like the mean filter, the median filter considers each pixel in the image in turn and looks at its
nearby neighbors to decide whether or not it is representative of its surroundings. Instead of
simply replacing the pixel value with the mean of neighboring pixel values, it replaces it with the
median of those values. The median is calculated by first sorting all the pixel values from the
surrounding neighborhood into numerical order and then replacing the pixel being considered
with the middle pixel value. (If the neighborhood under consideration contains an even number
of pixels, the average of the two middle pixel values is used.) Fig 6.2 illustrates an example
calculation.
Fig 6.2 Calculating the median value of a pixel neighborhood. As can be seen, the central pixel
value of 150 is rather unrepresentative of the surrounding pixels and is replaced with the median
value: 124. A 3×3 square neighborhood is used here; larger neighborhoods will produce more
severe smoothing.
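A minimal sketch of this calculation using NumPy and SciPy (the neighborhood values below
are chosen so that the unrepresentative centre value 150 is replaced by the median 124, as in the
example above; the exact values in Fig 6.2 may differ):

import numpy as np
from scipy.ndimage import median_filter

patch = np.array([[123, 125, 126],
                  [122, 150, 125],
                  [119, 115, 124]])

# Sort the 9 neighborhood values and take the middle one.
print(np.median(patch))              # 124.0

# The same operation applied over a whole image with a 3x3 window:
image = np.random.randint(0, 256, (5, 5))
print(median_filter(image, size=3))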
6.1.3 Histogram Equalization
Histogram equalization is a method in image processing of contrast adjustment using
the image's histogram. This method usually increases the global contrast of many images,
especially when the usable data of the image is represented by close contrast values. Through
this adjustment, the intensities can be better distributed on the histogram. This allows areas
of lower local contrast to gain a higher contrast. Histogram equalization accomplishes this by
effectively spreading out the most frequent intensity values.
Histogram equalization is a specific case of the more general class of histogram
remapping methods. These methods seek to adjust the image shown in Fig 6.3 to make it
easier to analyze or to improve its visual quality (e.g., retinex).
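A brief sketch of histogram equalization on a grayscale image with OpenCV (the file names are
placeholders):

import cv2

gray = cv2.imread('leaf.jpg', cv2.IMREAD_GRAYSCALE)  # placeholder path
equalized = cv2.equalizeHist(gray)                   # spread out frequent intensity values
cv2.imwrite('leaf_equalized.jpg', equalized)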
6.1.3.1 Back projection
The back projection (or "back project") of a histogrammed image is the re-application of the
modified histogram to the original image, functioning as a look-up table for pixel brightness
values.
For each group of pixels taken from the same position from all input single-channel
images, the function puts the histogram bin value to the destination image, where the coordinates
of the bin are determined by the values of pixels in this input group. In terms of statistics, the
value of each output image pixel characterizes the probability that the corresponding input pixel
group belongs to the object whose histogram is used.
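A short sketch of histogram back projection with OpenCV (the file names are placeholders;
using the hue channel of an HSV image is a common choice, assumed here):

import cv2

target = cv2.imread('scene.jpg')     # placeholder image to search in
roi = cv2.imread('patch.jpg')        # placeholder region whose histogram is used

hsv_target = cv2.cvtColor(target, cv2.COLOR_BGR2HSV)
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)

# Histogram of the hue channel of the ROI, then back-project it onto the target.
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
back_proj = cv2.calcBackProject([hsv_target], [0], roi_hist, [0, 180], 1)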
6.1.3.2 Histogram equalization of color images
The method described above is for grayscale images; however, it can also be used on color
images by applying the same method separately to the Red, Green, and Blue components of the
RGB color values of the image. However, applying the same method to the Red, Green, and
Blue components of an RGB image (Fig 6.4) may yield dramatic changes in the image's color
balance, since the relative distributions of the color channels change as a result of applying the
algorithm. If the image is first converted to another color space, Lab color space or HSL/HSV
color space in particular, then the algorithm can be applied to the luminance or value channel
without resulting in changes to the hue and saturation of the image. There are several histogram
equalization methods in 3D space. Trahanias and Venetsanopoulos applied histogram
equalization in 3D color space; however, it results in "whitening" (Fig 6.5), where the
probability of bright pixels is higher than that of dark ones. Han et al. proposed to use a new
CDF defined by the iso-luminance plane, which results in uniform gray distribution.
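A short sketch of equalizing only the value channel of an HSV image with OpenCV, as
described above (the file names are placeholders):

import cv2

bgr = cv2.imread('leaf_color.jpg')                 # placeholder path
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)
v_eq = cv2.equalizeHist(v)                         # equalize brightness only
hsv_eq = cv2.merge([h, s, v_eq])
result = cv2.cvtColor(hsv_eq, cv2.COLOR_HSV2BGR)   # hue and saturation unchanged
cv2.imwrite('leaf_color_equalized.jpg', result)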
CHAPTER 7
CONCLUSION
The model obtained an accuracy of 80.2% while using the ReLU, tanh, and sigmoid
activation functions together, whereas it gave about 50 percent accuracy while using only the
ReLU activation function and less than 50 percent while using the tanh function alone; the
highest accuracy was obtained by using these activation functions together. The model was able
to predict more accurately when combining the functions than when using them separately.
APPENDICES
SAMPLE CODING
# Imports and parameters assumed by the sample code below (the values for image size,
# batch size, epochs, sample counts and data directories are illustrative placeholders).
import numpy as np
import cv2
from keras import backend as K
from keras.models import Sequential, load_model
from keras.layers import Conv2D, MaxPooling2D, Activation, Dropout, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import EarlyStopping, ModelCheckpoint

img_width, img_height = 150, 150          # input image dimensions
train_data_dir = 'train'                  # placeholder training directory
validation_data_dir = 'test'              # placeholder validation directory
nb_train_samples = 1000                   # placeholder sample counts
nb_validation_samples = 200
batch_size = 16
epochs = 10
early_stopping_monitor = EarlyStopping(patience=3)
model_checkpoint = ModelCheckpoint('model.h5', save_best_only=True)

# Arrange the input shape according to the backend's channel ordering.
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)
model = Sequential()
model.add(Conv2D(32, (2, 2), input_shape = input_shape))
model.add(Activation('tanh'))
model.add(MaxPooling2D(pool_size =(2, 2)))
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.summary()
model.compile(loss ='binary_crossentropy',
optimizer ='adam',
metrics =['accuracy'])
train_datagen = ImageDataGenerator(
rescale = 1. / 255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1. / 255)
train_generator = train_datagen.flow_from_directory(train_data_dir,
target_size =(img_width, img_height),
batch_size = batch_size, class_mode ='binary')
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size =(img_width, img_height),
batch_size = batch_size, class_mode ='binary')
history = model.fit_generator(train_generator,
steps_per_epoch = nb_train_samples // batch_size,
epochs = epochs, validation_data = validation_generator,
validation_steps = nb_validation_samples // batch_size, verbose=1,
callbacks=[early_stopping_monitor, model_checkpoint])
# Accuracy Graph
import matplotlib.pyplot as plt
plt.figure()
# (in newer Keras versions these history keys are 'accuracy' / 'val_accuracy')
plt.plot(history.history['acc'], color='green')
plt.plot(history.history['val_acc'], color='blue')
plt.title('Accuracy ', pad=25)
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(['Train', 'Test'], loc='lower right')
plt.show()
# Loss Graph
import matplotlib.pyplot as plt
plt.figure()
plt.plot(history.history['loss'], color='green')
plt.plot(history.history['val_loss'], color='blue')
plt.title('Loss', pad=25)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend(['Train', 'Test'], loc='lower right')
plt.show()
obj = load_model('model.h5')
obj.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
img = cv2.imread('train/diseases/fifth.jpg')
img = cv2.resize(img, (150, 150))
# Rescale to [0, 1] to match the preprocessing used during training.
img = np.reshape(img, [1, 150, 150, 3]) / 255.0
classes = obj.predict_classes(img)
print(classes)
obj = load_model('model.h5')
obj.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
img = cv2.imread('test/cancer/cancer6.jpg')
img = cv2.resize(img, (150, 150))
# Rescale to [0, 1] to match the preprocessing used during training.
img = np.reshape(img, [1, 150, 150, 3]) / 255.0
classes = obj.predict_classes(img)
print(classes)
SAMPLE OUTPUT: