
Advances in Animal Biosciences: Precision Agriculture (ECPA) 2017, (2017), 8:2, pp 244–249 © The Animal Consortium 2017
doi:10.1017/S2040470017001376

Potato Disease Classification Using Convolution Neural Networks


D. Oppenheim1† and G. Shani2
1Department of Industrial Engineering and Management, Ben-Gurion University of the Negev, Beer Sheva, Israel; 2Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel

Many plant diseases have distinct visual symptoms which can be used to identify and classify them correctly. This paper presents a potato disease classification algorithm which leverages these distinct appearances and the recent advances in computer vision made possible by deep learning. The algorithm uses a deep convolutional neural network, training it to classify tubers into five classes: four disease classes and a healthy potato class. The database of images used in this study, containing potatoes of different shapes, sizes and diseases, was acquired, classified, and labelled manually by experts. The models were trained over different train-test splits to better understand the amount of image data needed to apply deep learning to such classification tasks.

Keywords: Plant disease detection and classification, Computer vision, Convolutional neural network, Potato diseases

Introduction

Potato (Solanum tuberosum) is the third most important food crop in the world, after cereals and rice. Global production exceeds 300 million metric tons and is an important nutrition and calorie provider for humanity (Pareek 2016). Potato production is threatened by several diseases resulting in considerable yield losses, causing a decrease in the quality and an increase in the price of potatoes (Taylor et al., 2008). An early disease detection system can aid in avoiding such cases. Moreover, it can improve the management of the crop and can further prevent the spread of diseases (Rich 2013). Manually detecting and sorting potatoes is difficult, costly, and time consuming, while computerized inspection may be more efficient and cost effective.

Computer vision and machine learning techniques for disease detection have been broadly researched in the last two decades (Garcia and Barbedo 2016). Diseases can be detected using expensive and bulky digital imaging sensors, such as spectral or near-infrared sensors. Using such sensors encumbers the widespread implementation of these methods due to their high costs and maintenance (Sankaran et al., 2010). On the other hand, researchers using the visible light bandwidth, which can be captured by relatively low-cost cameras, have usually focused on a single type of disease (Zhang et al., 2014). Single-case identification is insufficient for real-world applications, as a single tuber can be infected with a number of diseases (Cubero et al., 2016).

This paper leverages recent advances in computer vision and object recognition for classifying multiple diseases in potatoes. In 2012 a group of researchers from Toronto won the Large Scale Visual Recognition Challenge (ILSVRC) competition by improving the classification of the ImageNet database by more than 10%. They achieved a top-5 error rate of 15.3% when using a deep Convolution Neural Network (CNN), while the second best achieved a 26.2% error rate (Krizhevsky, Sutskever and Hinton, 2012). Since then, CNN methods have improved, and recently the classification error dropped to 3.73% by the winning team for the same task (Abdi and Nahavandi 2016). In the field of computer vision for agricultural applications, the use of CNNs and other deep neural networks is continuously increasing (Gongal et al., 2015). A CNN was recently used for detecting and classifying seven fruits in field conditions, improving detection accuracy by 3% over the last state of the art (Sa et al., 2016). CNNs used in classification tasks, such as disease classification of plant leaves or quality control of harvested fruit and vegetables, reached accuracy of more than 97% (Mohanty et al., 2016; Tan et al., 2015). In order to create successful CNNs, a large amount of training data is needed (Sermanet et al., 2013). Therefore, the first aim of the current research was the collection of a sufficient dataset and classification of the displayed diseases. Results indicate a first step towards multiple disease classification for potatoes using CNN.

Materials and Methods

Data acquisition
Photos of 400 contaminated potatoes of different shapes, sizes and tones were acquired under normal uncontrolled illumination conditions. The tubers were manually classified by experts as a standard procedure of statistically estimating the rate of various diseases in seed potato tubers prior to planting them in the fields. This procedure is done annually, independent of the current research. The potatoes were contaminated with

†E-mail: [email protected]

244
Downloaded from https://www.cambridge.org/core. Ben Gurion University of the Negev Library, on 12 Jul 2017 at 19:42:12, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S2040470017001376

Figure 1 Examples of visual symptoms of potato diseases: (a) Black Scurf disease - irregular, black, scab-like marks on the skin of the tuber. (b) Silver Scurf disease - circular or irregular, tan to silvery gray lesions on the tuber's skin. (c) Common Scab disease - circular brown rough areas, with irregular margins which can coalesce into larger areas. (d) Black Dot disease - tiny black dots on the skin of the tuber (magnified in top left corner). (e) Uninfected tuber.

four different diseases, all with significant visual symptoms on the tuber's skin (see Figure 1). The images were acquired using multiple types of standard cameras, captured from one viewpoint only. The cameras used were the Sony DSC-T200, the Apple iPhone 4 camera, and the Samsung Galaxy S3 camera.

Data preparation
The images acquired were used to create the training and test sets for the CNN. Every visual symptom of a disease was marked and labelled using the image labeler application in MatLab 2014b. The labelling was done with rectangular bounding boxes encompassing the visual symptom but also much regular potato skin, as seen in Figure 2. The marked areas were cropped from the original image, transformed into grayscale, and resized to a standard 224 × 224 pixel square. After preprocessing, a total of 2,465 patches of diseased potatoes was gathered, including: 265 Black Dot patches, 469 Black Scurf patches, 686 Common Scab patches, 738 Silver Scurf patches and 307 uninfected patches.

Performance Measurement
The experiment was designed to evaluate the performance of the CNN's learning algorithm in classifying four diseases and uninfected potatoes. As manually labelling diseased patches of potatoes is a tedious and time-costly task, an important goal was to determine the minimal amount of training data that provides sufficient classification accuracy. The CNN was trained with different sizes of training sets. The smallest training set was 10% of the 2,465 images, incrementally increasing by 10% to 90% of the whole dataset, as detailed in Table 1. In each increment the images were selected uniformly from the whole dataset. Testing of the algorithm was done on the remaining data. In total, the training and testing phases were repeated 9 times over different training set sizes.

Each training set was trained for 90 epochs, where one epoch is defined as one full training cycle on every sample in the training set. The choice of limiting training to 90 epochs was made based on empirical observations revealing that the learning converged well within 90 epochs (as can be seen in Figure 4). In order to compare the different results over the 9 training sets, the error rate of the best-scoring guess was calculated as the number of errors divided by the total number of test images in every epoch. The error was calculated for both the test and train sets, in order to understand the over/under fitting of the procedure.
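The incremental evaluation described above (nine runs, training on 10% to 90% of the 2,465 patches and testing on the remainder) can be sketched in Python. The paper says images were "selected uniformly"; evenly strided indices are one plausible reading of that phrase, not necessarily the authors' exact procedure.

```python
def split_indices(n_images, train_fraction):
    """Evenly strided training indices; everything else forms the test set."""
    n_train = round(n_images * train_fraction)
    step = n_images / n_train
    train = {int(i * step) for i in range(n_train)}      # uniform spread
    test = [i for i in range(n_images) if i not in train]
    return sorted(train), test

# Nine experiments: train on 10%, 20%, ..., 90% of the 2,465 patches.
splits = {pct: split_indices(2465, pct / 100) for pct in range(10, 100, 10)}
```

Because the stride exceeds 1 for every fraction used here, the selected indices are distinct and each run partitions the dataset exactly.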


Figure 2 Examples of diseased potato patches before and after the transformation to grayscale. (a)–(c), (g) and (h) are the original RGB images from each class. (d)–(f), (i) and (j) are the same images after the conversion to grayscale, using Matlab's rgb2gray function.
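The grayscale conversion and resizing step shown in Figure 2 can be sketched in plain Python. The paper used Matlab's rgb2gray, which applies the BT.601 luminance weights, so those weights are assumed here; the nearest-neighbour resize and the toy patch dimensions are illustrative choices, not the authors' code.

```python
def rgb_to_gray(rgb):
    """H x W x 3 nested list of 0-255 values -> H x W grayscale floats (BT.601)."""
    return [[0.2989 * r + 0.5870 * g + 0.1140 * b for (r, g, b) in row]
            for row in rgb]

def resize_nearest(img, out_h=224, out_w=224):
    """Nearest-neighbour resize of a 2-D nested list to out_h x out_w."""
    in_h, in_w = len(img), len(img[0])
    return [[img[min(in_h - 1, y * in_h // out_h)][min(in_w - 1, x * in_w // out_w)]
             for x in range(out_w)]
            for y in range(out_h)]

# A hypothetical 180 x 300 cropped patch becomes a standard 224 x 224 matrix.
patch = [[(120, 64, 32)] * 300 for _ in range(180)]
standard = resize_nearest(rgb_to_gray(patch))
```

In practice a library resize (e.g. with interpolation) would be preferred; nearest-neighbour keeps the sketch dependency-free.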

Algorithm

The algorithm chosen for the image classification task was a deep convolutional neural network (CNN). The basic architecture chosen for this problem was a CNN developed by the Visual Geometry Group (VGG) from the University of Oxford, named CNN-F due to its faster training time (Chatfield et al., 2014). Several new dropout layers were added to the VGG architecture to deal with problems of overfitting, especially due to the relatively small dataset.

The required input image size for this network is a 224 × 224 matrix. The CNN comprises 8 learnable layers, the first 5 of which are convolutional, followed by 3 fully-connected layers and ending with a softmax layer (see Figure 3). The softmax layer normalizes the input received from the last fully-connected layer (fc3), producing a distribution of values, one for each class. These values sum to 1 and represent the probability of the input image belonging to each of the five classes. This softmax layer was also altered and adapted, reducing its size from 1,000 to 5 to fit our classification task.

The hyper-parameters used in each training experiment were:

∙ Solver type: Stochastic Gradient Descent
∙ Learning rate: 0.0001
∙ Batch size: 50
∙ Momentum: 0.9
∙ Weight decay: 0.0005
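These hyper-parameters correspond to the textbook SGD-with-momentum update in which the weight-decay (L2) term is added to the gradient before the momentum step. A minimal single-parameter sketch under that assumption follows; the authors' actual solver is whatever their training framework implemented.

```python
# One SGD step with the hyper-parameters listed above.
LR, MOMENTUM, WEIGHT_DECAY = 0.0001, 0.9, 0.0005

def sgd_step(weight, grad, velocity):
    """Return (new_weight, new_velocity) for a single scalar parameter."""
    grad = grad + WEIGHT_DECAY * weight          # add the L2 weight-decay term
    velocity = MOMENTUM * velocity - LR * grad   # accumulate momentum
    return weight + velocity, velocity

# Three toy updates with a constant gradient, starting from weight 1.0.
w, v = 1.0, 0.0
for _ in range(3):
    w, v = sgd_step(w, grad=0.5, velocity=v)
```

In a real network the same update is applied element-wise to every weight tensor in the 8 learnable layers.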


Training CNNs usually requires a large amount of labelled data in order to perform a good classification. Therefore, two methods were used for data augmentation. Mirroring creates additional examples by randomly flipping the images used in training; as the direction of the photos was arbitrary, mirroring the image horizontally does not change the correctness of the data. Cropping was also used: cropping the image randomly to different sizes, while keeping the cropped image to a minimum size of 190 × 190, can achieve data diversity. The use of each data augmentation method was done randomly: before each image was inserted into the net for training, it was mirrored, cropped, or inserted without alteration, in equal distributions. Therefore, two thirds of the images trained were altered.

Table 1 Train and test set division.

Set  Train %  Test %  # Black Scurf     # Common Scab     # Silver Scurf    # Black Dot       # Uninfected
                      trained / tested  trained / tested  trained / tested  trained / tested  trained / tested
1    90       10      423 / 46          618 / 68          665 / 73          239 / 26          277 / 30
2    80       20      377 / 92          550 / 136         592 / 146         213 / 52          247 / 60
3    70       30      331 / 138         482 / 204         519 / 219         187 / 78          217 / 90
4    60       40      285 / 184         414 / 272         446 / 292         161 / 104         187 / 120
5    50       50      239 / 230         346 / 340         373 / 365         135 / 130         157 / 150
6    40       60      193 / 276         278 / 408         300 / 438         109 / 156         127 / 180
7    30       70      147 / 322         210 / 476         227 / 511         83 / 182          97 / 210
8    20       80      101 / 368         142 / 544         154 / 584         67 / 208          67 / 240
9    10       90      55 / 414          74 / 612          81 / 657          31 / 234          37 / 270
Total                 469               686               738               265               307

Results and discussion

Results indicate, as expected, that using more data for the training phase improves the classification and reduces the error rate (see Figure 4). The best trained model (trained on 90% of the dataset and tested on the remaining 10%) classified 96% of the images correctly. Results indicate that for 8 out of the 9 training sets, accuracy does not drop below 90% as the training set size decreases (Figure 4); the average difference of error rates between the best training set (90% train-10% test) and the worst training set (20% train-80% test) in these 8 sets was 5.73%. There is a significant drop in performance when the CNN was trained on 10% of the dataset and tested on 90% of it. Correct classification for this training set decreased to 83%, as opposed to 90% for the classifier obtained with 20% train and 80% test. The relatively small decrease in accuracy (Table 2) for most training set sizes is an indicator that a small amount of potato images could suffice for training a sufficiently accurate CNN.

In order to further evaluate the CNN's classification, a confusion matrix was calculated. The confusion matrix's columns represent the CNN's class classification, while the rows represent the actual classes. This type of representation can help evaluate the CNN's classification of each class. Figure 5 shows a confusion matrix of the best performing CNN, trained on 90% of the dataset and tested on 10%. The confusion matrix shows that the CNN classified infected potato tubers correctly and with high accuracy: 100% of the tubers which were infected with Black Dot and Black Scurf were classified correctly, and over 92% of the Silver Scurf and Common Scab infected tubers were classified correctly as well. The CNN's performance dropped when classifying uninfected tubers. Most of the CNN's misclassifications occurred when classifying uninfected tubers into the disease class Silver Scurf. Silver Scurf's visual symptoms are bright tan to silvery gray lesions on the tuber's skin that resemble uninfected skin.

These results indicate that the trained CNN can classify correctly and accurately the four diseases presented here. However, uninfected tubers were harder to classify. The fact that the diseases were classified with high accuracy makes it suitable for a system in which identification of the disease is important. Most misclassifications occurred for the uninfected


Figure 3 A simplified model of the CNN used.
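The five-way softmax layer at the end of the network, described in the Algorithm section, normalises the fc3 outputs into a probability distribution. A minimal sketch, with made-up input scores:

```python
import math

def softmax(logits):
    """Normalise raw class scores into probabilities that sum to 1."""
    m = max(logits)                              # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Five raw fc3 scores, one per class (four diseases + uninfected); values made up.
probs = softmax([2.0, 0.5, 0.1, -1.2, 0.3])
```

Reducing the layer from 1,000 to 5 outputs simply means fc3 emits 5 scores here instead of the 1,000 used for ImageNet.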

Figure 4 Results of the experiment. The top plot shows the accuracy of each training set (colored and dotted graphs) according to the epoch number. The bottom plot shows the smoothed results.
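The data augmentation policy described earlier (each training image mirrored horizontally, randomly cropped to no smaller than 190 × 190, or passed through unchanged, with equal probability) might look like the following sketch; the function names and the use of Python's random module are assumptions, not the authors' code.

```python
import random

MIN_CROP = 190  # smallest side length kept when cropping, per the text

def mirror(img):
    """Horizontal flip of a 2-D nested list (direction of photos is arbitrary)."""
    return [row[::-1] for row in img]

def random_crop(img, rng):
    """Square crop of a random size between MIN_CROP and the full image."""
    size = rng.randint(MIN_CROP, len(img))
    y = rng.randint(0, len(img) - size)
    x = rng.randint(0, len(img[0]) - size)
    return [row[x:x + size] for row in img[y:y + size]]

def augment(img, rng):
    """Mirror, crop, or leave unchanged with equal probability (2/3 altered)."""
    choice = rng.choice(["mirror", "crop", "none"])
    if choice == "mirror":
        return mirror(img)
    if choice == "crop":
        return random_crop(img, rng)
    return img

rng = random.Random(0)
patch = [[v for v in range(224)] for _ in range(224)]
augmented = augment(patch, rng)
```

Each image passes through this function once before entering the network, so roughly two thirds of trained images are altered, as stated in the text.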

Table 2 Best performing model accuracy results for each train-test set
Train–Test Set Division 90%–10% 80%–20% 70%–30% 60%–40% 50%–50% 40%–60% 30%–70% 20%–80% 10%–90%

Accuracy 0.9585 0.9567 0.9465 0.9454 0.9069 0.9183 0.9041 0.9012 0.8321
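The per-class evaluation via a confusion matrix (rows = actual classes, columns = predicted classes, cells as row percentages, as in Figure 5) can be computed in a few lines. The example labels below are fabricated for illustration, not the paper's results.

```python
CLASSES = ["Black Scurf", "Common Scab", "Silver Scurf", "Black Dot", "Uninfected"]

def confusion_matrix(actual, predicted):
    """Count (actual, predicted) pairs, then normalise each row to percentages."""
    idx = {c: i for i, c in enumerate(CLASSES)}
    counts = [[0] * len(CLASSES) for _ in CLASSES]
    for a, p in zip(actual, predicted):
        counts[idx[a]][idx[p]] += 1
    return [[100.0 * c / (sum(row) or 1) for c in row] for row in counts]
```

Reading along a row then shows, for each true class, where its images ended up, which is exactly how the uninfected/Silver Scurf confusion is diagnosed in the discussion.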

class; for practical use these mistakes have less effect, since planting infected tubers can spread the disease and cause considerable damage, while misclassifying uninfected tubers can be resolved. In this experiment only 307 images of uninfected tubers were used; increasing the amount of data on uninfected tubers could increase classification accuracy.

Conclusions and future work

The applicability of a convolution neural network in classifying image patches of diseased potatoes into four disease classes and an uninfected class was examined. The 2,465 images classified by the trained CNN model varied in the acquisition device and conditions. Results indicate the robustness of the classification algorithm, allowing for uncontrolled acquisition conditions. Results reveal that the correct classification of fully trained CNN models ranges from 83% for the model trained on the least amount of data, to 96% when the model was trained on 90% of the data. To obtain classification rates higher than 90% it is sufficient to use 20% of the images (i.e., 493 images).

These results further show that combining the CNN introduced here with a sliding window algorithm could be utilized for classifying full images of potatoes into different diseases with little labelling work beforehand. Ongoing research aims to develop a classification algorithm with an expanded number of disease classes. Acquiring data can be done easily since there are no constraints on the data acquisition.

Acknowledgements
This work is supported by the Ministry of Agriculture and partially supported by the Helmsley Charitable Trust through the Agricultural, Biological and Cognitive Robotics Initiative and the Rabbi W. Gunther Plaut Chair in Manufacturing Engineering,


both at Ben-Gurion University of the Negev. We thank Professor Yael Edan for her important comments.

Figure 5 A confusion matrix of the CNN trained on 90% of the dataset and tested on the remaining 10%. Rows represent the actual classes of an image. Columns represent the CNN's class prediction. Each cell in the matrix represents the percentage of images of the row's class that were classified to the column's class.

References

Abdi M and Nahavandi S 2016. Multi-Residual Networks: Improving the Speed and Accuracy of Residual Networks. https://arxiv.org/abs/1609.05672.
Chatfield K, Simonyan K, Vedaldi A and Zisserman A 2014. Return of the Devil in the Details: Delving Deep into Convolutional Nets. https://arxiv.org/abs/1405.3531.
Cubero S, Lee SW, Aleixos N, Albert F and Blasco J 2016. Automated Systems Based on Machine Vision for Inspecting Citrus Fruits from the Field to Postharvest: A Review. Food and Bioprocess Technology, 1–17.
Garcia J and Barbedo A 2016. A Review on the Main Challenges in Automatic Plant Disease Identification Based on Visible Range Images. Biosystems Engineering 144, 52–60.
Gongal A, Amatya S, Karkee M, Zhang Q and Lewis K 2015. Sensors and Systems for Fruit Detection and Localization: A Review. Computers and Electronics in Agriculture 116, 8–19.
Krizhevsky A, Sutskever I and Hinton GE 2012. ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems, 1–9.
Mohanty SP, Hughes D and Salathé M 2016. Using Deep Learning for Image-Based Plant Disease Detection. Frontiers in Plant Science 7.
Pareek S 2016. Postharvest Ripening Physiology of Crops. CRC Press, Taylor and Francis, FL, USA.
Rich AE 2013. Potato Diseases. Elsevier Science, Academic Press, New York, USA.
Sa I, Ge Z, Dayoub F, Upcroft B, Perez T and McCool C 2016. DeepFruits: A Fruit Detection System Using Deep Neural Networks. Sensors 16 (8), 1222.
Sankaran S, Mishra A, Ehsani R and Davis C 2010. A Review of Advanced Techniques for Detecting Plant Diseases. Computers and Electronics in Agriculture 72 (1), 1–13.
Sermanet P, Eigen D, Zhang X, Mathieu M, Fergus R and LeCun Y 2013. OverFeat: Integrated Recognition, Localization and Detection Using Convolutional Networks. https://arxiv.org/abs/1312.6229.
Tan W, Zhao C and Wu H 2015. Intelligent Alerting for Fruit-Melon Lesion Image Based on Momentum Deep Learning. Multimedia Tools and Applications 75 (24), 16741–16761.
Taylor RJ, Pasche JS and Gudmestad NC 2008. Susceptibility of Eight Potato Cultivars to Tuber Infection by Phytophthora Erythroseptica and Pythium Ultimum and Its Relationship to Mefenoxam-Mediated Control of Pink Rot and Leak. Annals of Applied Biology 152 (2), 189–199.
Zhang B, Huang W, Jiangbo L, Zhao C, Fan S, Wu J and Liu C 2014. Principles, Developments and Applications of Computer Vision for External Quality Inspection of Fruits and Vegetables: A Review. Food Research International 62, 326–343.

