Image Processing of Big Data For Plant Diseases of
ABSTRACT
In this research, plant pathogens are considered as big data because of the numerical counts for
high intensity pixels in the images. The research presents an automated approach for early detection
of plant diseases using image processing techniques. By analyzing the color features of leaf areas,
the k-means algorithm for color segmentation and the Gray-Level Co-Occurrence Matrix (GLCM)
are used for disease classification. A novelty of this research is that it illustrates four categories of
plants to analyze and compare: (1.) Grain, represented by Rice Plant Leaf Data; (2.) Fruit, represented
by banana plant leaf data, (3.) Flower, represented by sunflower plant leaf data; and (4.) Vegetable,
represented by potato plant leaf data. Six stages of image processing are applied to real data for
diseases of leaf smut for rice, black sigatoka for banana, leaf scars for sunflower, and late blight for
potato. Finally, a comparison of the image processing for each of the four plant types, conclusions,
and future research directions are presented.
KEYWORDS
AlexNet, Artificial Intelligence, Black Sigatoka, Deep Learning, GoogLeNet, Gray-Level Co-Occurrence Matrix
(GLCM), Image Processing, Late Blight, Leaf Scars, Leaf Smut, Python
1. INTRODUCTION
DOI: 10.4018/IJCVIP.353913
This article published as an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) which permits unrestricted use, distribution, and reproduction in any medium, provided the author of the original work and original publication source are properly credited.
International Journal of Computer Vision and Image Processing
Volume 14 • Issue 1 • January-December 2024
The pixel counts shown in Table 10 for each of the images in the examples presented in this article clearly classify this study as using big data for the image processing.
The research problem is to investigate the effectiveness and accuracy of image processing, with the use of deep-learning architectures, for disease identification of four different types of plants within a framework based on GoogLeNet and AlexNet. The steps of model evaluation using GoogLeNet and AlexNet are shown in Appendix A.
The categories of plants are grain, fruit, flower, and vegetable, as shown in Figure 1, which provides a visual aid for understanding the overall structure and novelty of the research performed. Using a public dataset of 24,449 files obtained from Kaggle (Plant Village, Rice Leaf Disease Detection, Sunflower Fruits and Leaves, and Banana Disease Recognition Dataset), we then apply image processing techniques for plant disease identification across these four categories of plants.
For our research paper on plant disease identification, we leveraged datasets from Kaggle.com.
Below is a detailed explanation of how we selected and utilized these datasets:
1. Vegetables:
Plant Village Dataset: This dataset is extensive, containing 20,600 files. We focused on the potato
late blight disease category, which had 1,000 files. The large size and diversity of this dataset
made it ideal for training and validating our model on vegetable diseases.
2. Grains:
Rice Leaf Disease Dataset: This dataset had 120 files, with 40 specifically for rice leaf smut
disease. Despite its smaller size, it provided crucial data for evaluating our model's
performance on grain diseases.
3. Flowers:
Sunflower Fruits and Leaves Dataset: Containing 465 files, this dataset was used to study leaf scars disease, which had 140 files. It helped us test the model's accuracy on flower-related diseases.
4. Fruits:
Banana Disease Recognition Dataset: With 3,264 files, this dataset focused on black sigatoka
disease, represented by 67 files. Given the importance of bananas as a fruit crop, this dataset
was vital for our fruit disease identification study.
1.3 Implementation
1. Data Preprocessing: Each dataset was preprocessed to ensure uniformity. This included resizing
images, normalizing pixel values, and data augmentation to increase the sample size for training.
2. Model Training and Testing: The datasets were divided into training and testing sets. We employed
machine learning algorithms to train models on the training data and evaluated their performance
on the test data.
3. Evaluation Metrics: Metrics such as accuracy, precision, recall, and F1-score were used to assess
the models' effectiveness in identifying plant diseases across different categories.
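The metrics listed in step 3 can be computed directly from predicted and true labels. The sketch below assumes a binary healthy/diseased labeling task; the function name and sample data are illustrative, not taken from the paper's experiments.

```python
def binary_metrics(y_true, y_pred, positive="diseased"):
    """Accuracy, precision, recall, and F1-score for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1
```

The F1-score is the harmonic mean of precision and recall, which is why it is the summary statistic reported for the train-test splits later in the article.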
By using these Kaggle datasets, we were able to conduct a comprehensive study on plant disease
identification, leveraging diverse and high-quality data to train and validate our models effectively.
The authors presented six stages of image processing and then specifically applied them to real
data for diseases of leaf smut for rice, black sigatoka for banana, leaf scars for sunflower, and late blight
for potato. The plant image data sets used in this paper were obtained from kaggle.com repositories
having web addresses as provided in Appendix. Finally, a comparison of the image processing results
is made for each of the four plant types, conclusions, and future research directions are presented.
Bharate and Shirdhonkar (2017) and Hasan et al. (2022) presented reviews of plant disease detection using image processing. Image processing can be considered big data when one considers the number of pixels in each image, which can be partitioned into subareas of healthy pixels and diseased pixels.
Figure 2 offers a clear and concise overview of our image processing methodology for analyzing
and classifying diseased leaf images. It starts with capturing the original image and then systematically
moves through essential steps, such as noise reduction and contrast enhancement, to prepare the
image for detailed examination.
Next, the process involves segmenting the image to differentiate between healthy and diseased
areas. We then extract key features from the diseased spots and classify the results, using color-coding to
make the distinctions easy to understand. This step-by-step approach ensures we can accurately assess
the extent of the disease and visualize its impact on the plants. The flow chart in Figure 2 captures
this methodology, illustrating each vital stage in our process in a straightforward and accessible way.
Workflow of Steps 1 to 6
The following crucial steps are involved in image processing for the identification of plant diseases:
In more detail, first, digital cameras or smartphones are used to take high-quality pictures of plant leaves. Preprocessing is applied to these photos to improve their quality, such as contrast enhancement and noise reduction. Using methods such as thresholding or k-means clustering, the leaf is separated from the background in the following step, that is, image segmentation. The process of extracting color, texture, and shape features from the segmented images comes next. Techniques such as the GLCM for texture and statistical analysis of color channels are frequently employed.
Table 1. Authors who have used image processing for plant visualization
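As a sketch of the GLCM texture features mentioned above, the NumPy snippet below builds a co-occurrence matrix for one pixel offset and derives the contrast statistic. The offset, level count, and choice of statistic are illustrative; the paper does not specify its GLCM parameters.

```python
import numpy as np

def glcm(gray, levels=4, dx=1, dy=0):
    """Gray-Level Co-occurrence Matrix for one offset (dx, dy).
    gray: 2-D array of integer gray levels in [0, levels)."""
    m = np.zeros((levels, levels), dtype=np.int64)
    h, w = gray.shape
    for y in range(h - dy):
        for x in range(w - dx):
            # count each (reference, neighbor) gray-level pair
            m[gray[y, x], gray[y + dy, x + dx]] += 1
    return m

def glcm_contrast(m):
    """Contrast: sum of (i - j)^2 * p(i, j) over the normalized GLCM."""
    p = m / m.sum()
    i, j = np.indices(m.shape)
    return float(((i - j) ** 2 * p).sum())
```

In practice a library routine (e.g., scikit-image's graycomatrix) would be used; the loop form here just makes the counting explicit.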
Following the extraction of features, the features are classified, and the presence and type of disease are identified using deep learning techniques (e.g., CNNs) or machine learning algorithms (e.g., SVMs). Accurate diagnosis is made possible by this classification process, which associates the features with recognized disease categories. The last stage is disease identification, in which the plant's specific disease is identified and potential treatments are suggested using the classification results. This methodical approach improves early detection and management of plant diseases, which may lower agricultural losses and boost crop quality.
Table 2. Authors who have performed image processing with artificial intelligence (AI) for plant visualization
Figure 2. Flow chart for leaf disease recognition (Fulari et al., 2020)
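The k-means clustering mentioned earlier for the segmentation step can be illustrated in one dimension on scalar pixel intensities. This is a simplified sketch, not the paper's implementation; the iteration count and seed are arbitrary choices.

```python
import numpy as np

def kmeans_1d(values, k=2, iters=20, seed=0):
    """Tiny k-means on scalar intensities: returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    v = np.asarray(values, dtype=float)
    # initialize centers from distinct sample positions
    centers = rng.choice(v, size=k, replace=False)
    for _ in range(iters):
        # assign each value to its nearest center
        labels = np.argmin(np.abs(v[:, None] - centers[None, :]), axis=1)
        # move each center to the mean of its assigned values
        for c in range(k):
            if np.any(labels == c):
                centers[c] = v[labels == c].mean()
    return centers, labels
```

With k=2 on leaf pixels, one cluster would gather the darker diseased intensities and the other the healthy background tones; real segmentation runs the same loop on three-channel color vectors.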
Table 3 provides a sample Python function for each of the steps of the Figure 2 flow chart for leaf disease recognition.
In this research, we performed image processing for selected categories of grain, fruit, flower, and
vegetable and for a representative plant for each of these categories, namely, rice, banana, sunflower,
and potato, respectively, as we presented in the following sections.
A staple food for more than half of the world's population, especially in Asia, is rice (Oryza sativa). Rice is a member of the Poaceae grass family and is grown mostly in warm, tropical, and subtropical climates. Indica and Japonica are its two primary subspecies. For many, the grain is a vital source of carbohydrates and energy (Food and Agriculture Organization [FAO], 2013; Khush, 1997).
Figure 3. Image processing steps as shown for two different original images of rice leaves
Note. Step 1: original; step 2a: noise reduction; step 2b: contrast enhancement; step 3: improved segmentation; step 4: feature extraction; step 5: classification. Source of original images (Step 1): Kaggle (2020), https://www.kaggle.com/datasets/vbookshelf/rice-leaf-diseases
Especially in Asia, rice farming plays a major role in the economies and cultures of many nations. It requires particular growing conditions, including ample water, rich soil, and a warm climate. Increased rice yields resulting from improved agricultural techniques have contributed to feeding the world's expanding population (International Rice Research Institute, 2020). Furthermore, rice farming promotes environmental sustainability and biodiversity (Surridge, 2004).
Table B1 in Appendix B provides feature statistics tables for leaf smut disease of rice used in this
study. Figure 3 shows the image processing steps derived from flow chart in Figure 2, with substeps
from step 2 (i.e., image preprocessing). Table 4 provides descriptions for each of the processing steps
as applied to the images for rice leaves.
decreases towards 0.3, while test loss fluctuates between 0.4 and 0.6. For the Train-Test Set Division, a 20-80 split shows a Mean F1-Score between 0.90 and 0.96, a 40-60 split ranges from 0.92 to 0.95, a 50-50 split from 0.90 to 0.94, a 60-40 split from 0.88 to 0.92, and an 80-20 split from 0.88 to 0.92. In terms of Dataset Type, color datasets have a Mean F1-Score fluctuating between 0.90 and 0.96, grayscale datasets from 0.90 to 0.95, and segmented datasets from 0.92 to 0.96.
Step 1: Original image. The initial image shows a rice leaf with scattered marks hinting at a potential disease.
Step 2a: Noise reduction. The function begins by reducing noise in the image using a Gaussian blur, which effectively smooths the image.
Step 2b: Contrast enhancement. Contrast limited adaptive histogram equalization (CLAHE) is applied to the L channel. CLAHE is a variant of adaptive histogram equalization that limits the contrast amplification to reduce noise.
Step 3: Improved segmentation. The Python function segment_image segments an image based on a specified color range in the hue-saturation-value (HSV) color space, dividing the image into three different clusters (Figure 3).
Step 4: Feature extraction. The Python function extract_features extracts specific features from a segmented image based on a provided mask. The function returns two features: the number of segmented pixels (num_pixels) and the mean color of the segmented region (mean_color).
Step 5: Classification. The Python function classify_disease classifies the health status of an image based on the area of a segmented region and its mean color. Additionally, it creates a classification image with text indicating the disease status.
Step 6: Disease identification. Presented below in Figure 5 and Table 5.
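The extract_features and classify_disease steps described above might be sketched as follows. The area threshold is a hypothetical value for illustration; the paper's actual thresholds are not given here.

```python
import numpy as np

def extract_features(image, mask):
    """Count masked pixels and compute their mean color, as in the
    extract_features step: returns (num_pixels, mean_color)."""
    sel = np.asarray(mask).astype(bool)
    num_pixels = int(sel.sum())
    mean_color = image[sel].mean(axis=0) if num_pixels else np.zeros(3)
    return num_pixels, mean_color

def classify_disease(num_pixels, total_pixels, threshold=0.10):
    """Label the leaf diseased when the segmented area exceeds a
    threshold fraction of the image (threshold is illustrative)."""
    return "diseased" if num_pixels / total_pixels > threshold else "healthy"
```

A fuller version would also test the mean color (e.g., the red component) against a range, as the classification step for potato later in the article describes.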
Bananas, or Musa spp., are a popular fruit that are consumed all over the world and are
distinguished by their unique shape, sweet flavor, and yellow peel. Bananas, which are members of
the Musaceae family, are native to tropical and subtropical areas. They provide vital nutrients such
as potassium, vitamin C, and dietary fiber, making them a staple food in many parts of the world
(Robinson & Sauco, 2010).
Many developing nations, especially those in Latin America, Africa, and Southeast Asia, rely
heavily on banana cultivation for their economies. In order to grow properly, the crop needs warm
temperatures, high humidity, and well-drained soil. Improvements in farming methods and disease
control have raised banana yields and quality, guaranteeing a consistent supply to fulfill demand
worldwide (Smith et al., 2007). Furthermore, in many rural communities, bananas are essential for
generating income and ensuring food security [Food and Agriculture Organization (FAO), 2019].
Table B2 in Appendix B provides feature statistics tables for black sigatoka disease of banana
used in this study. Figure 6 shows the image processing steps derived from the flow chart in Figure
2, with sub- steps from step 2 (i.e., image preprocessing). Table 6 provides descriptions for each of
the processing steps as applied to the images of banana leaf.
Figure 6. Image processing steps as shown for two different original images of banana leaves
Note. Step 1: original; step 2a: noise reduction; step 2b: contrast enhancement; step 3: improved segmentation; step 4: feature extraction; step 5: classification. Source of original image (Step 1): Kaggle (2023), https://www.kaggle.com/datasets/sujaykapadnis/banana-disease-recognition-dataset
Step 1: Original image. The initial image showed a banana leaf with prominent large spots and discoloration.
Step 2a: Noise reduction. The cv2.GaussianBlur(image, (5, 5), 0) function applies a Gaussian filter to the image. The (5, 5) kernel size specifies the width and height of the filter, which results in a denoised image.
Step 2b: Contrast enhancement. The clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8, 8)) function creates a contrast limited adaptive histogram equalization (CLAHE) object with a clipLimit of 3.0 (which prevents over-amplification of contrast).
Step 3: Improved segmentation. The function begins by converting the input image from the BGR (i.e., blue-green-red) color space to the hue-saturation-value (HSV) color space. This separates the color information (hue) from the intensity information (value), making it easier to define color ranges that divide the image into separate clusters (Figure 5).
Step 4: Feature extraction. This operation keeps only the pixels in the image that correspond to the white areas in the mask, effectively isolating the segmented regions.
Step 5: Classification. This function classifies the health status of an image based on the area and mean color of a segmented region, and it creates an annotated image with the classification result.
Step 6: Disease identification. Presented below in Figure 8 and Table 7.
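The Gaussian filtering step can be illustrated without OpenCV as a separable convolution. This is a sketch: cv2.GaussianBlur derives sigma from the kernel size when sigma is 0, whereas here sigma is fixed explicitly.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """1-D Gaussian kernel, normalized to sum to 1."""
    x = np.arange(size) - size // 2
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(gray, size=5, sigma=1.0):
    """Separable Gaussian blur of a 2-D image with edge replication."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    g = np.pad(gray.astype(float), pad, mode="edge")
    # horizontal pass, then vertical pass (separability of the Gaussian)
    g = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, g)
    g = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, g)
    return g
```

Because the kernel is normalized and applied separably, a constant image passes through unchanged and overall brightness is preserved, which is why the blur denoises without shifting the color statistics used later.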
and an 80-20 split shows a Mean F1-Score of 0.90. In terms of Dataset Type, color datasets have a
Mean F1-Score of 0.92, grayscale datasets have a Mean F1-Score of 0.90, and segmented datasets
have a Mean F1-Score of 0.94.
The well-known and extensively grown sunflower (Helianthus annuus) is distinguished by its
large and brilliant yellow flowers and tall stems. Sunflowers are native to North America and are now
grown all over the world. They are members of the Asteraceae family. In addition to their aesthetic
appeal, they are prized for their edible seeds, which contain an abundance of oil and essential nutrients
such as vitamin E, magnesium, and selenium (Fick & Miller, 1997).
In many agricultural economies, especially in Argentina, Russia, and Ukraine, sunflower cultivation is important. The crop is a flexible option for farmers because it can be grown in a variety of climates and soil types. Sunflowers are grown mainly for their seeds, which are refined into oil and used in industrial, cosmetic, and culinary applications. Sunflowers are important economically because of increased yields and resistance to disease brought about by improvements in breeding and farming practices (Skoric, 2012). Sunflowers also contribute to pollinator populations and biodiversity (McGregor, 1976).
Table B3 in Appendix B provides feature statistics tables for leaf scars disease of sunflower
used in this study. Figure 9 shows the image processing steps derived from the flow chart in Figure
2, with sub-steps from step 2 (i.e., image preprocessing). Table 8 provides descriptions for each of
the processing steps as applied to the images of sunflower leaves.
Figure 9. Image processing steps as shown for two different original images of sunflower leaves
Note. Step 1: original; step 2a: noise reduction; step 2b: contrast enhancement; step 3: improved segmentation; step 4: feature extraction; step 5: classification. Source of original image (Step 1): Kaggle (2022), https://www.kaggle.com/datasets/noamaanabdulazeem/sunflower-fruits-and-leaves-dataset
Step 1: Original image. The initial image had widespread spots of varying sizes on the sunflower leaf.
Step 2a: Noise reduction. Technique used: nonlocal means denoising. Process: the image was converted to an array, denoised with OpenCV, and converted back to an image.
Step 2b: Contrast enhancement. Technique used: image contrast enhancement. A contrast factor was applied to stretch the contrast range.
Step 3: Improved segmentation. The segment_image function creates a binary mask using cv2.inRange(hsv, lower_bound, upper_bound). This function checks each pixel in the HSV image to see whether its values fall within the specified range and returns the binary mask, which can be used to isolate or highlight the regions of the original image that fall within the specified color range (Figure 7).
Step 4: Feature extraction. Technique used: contour detection and region analysis. Process: the connected components were labeled, and contours were extracted and drawn around each spot.
Step 5: Classification. The classification_text function creates a string with the disease status and the affected percentage formatted to two decimal places.
Step 6: Disease identification. Presented below in Figure 11 and Table 9.
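The "label the connected components" step can be sketched as a 4-connected breadth-first labeling. This is a simplified stand-in for the OpenCV/scikit-image routines the paper presumably used.

```python
import numpy as np
from collections import deque

def label_components(mask):
    """4-connected component labeling of a binary mask via BFS.
    Returns (label image, number of components found)."""
    mask = np.asarray(mask, dtype=bool)
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for y in range(mask.shape[0]):
        for x in range(mask.shape[1]):
            if mask[y, x] and labels[y, x] == 0:
                current += 1  # start a new disease spot
                labels[y, x] = current
                q = deque([(y, x)])
                while q:
                    cy, cx = q.popleft()
                    # visit the four edge-adjacent neighbors
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                                and mask[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = current
                            q.append((ny, nx))
    return labels, current
```

The component count gives the number of spots, and per-label pixel counts give spot sizes, which is exactly the information the spot-characteristics discussion later in the article relies on.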
Due to their versatility and high nutritional content, potatoes (Solanum tuberosum) are a staple food crop grown and consumed all over the world. They are indigenous to the Andean region of South America and are members of the Solanaceae family. In addition to being high in essential nutrients such as vitamin C, potassium, and dietary fiber, potatoes are also high in carbohydrates, especially starch (CIP International Potato Center, 2018). They are a versatile ingredient in many cuisines because of the variety of ways they can be prepared, such as boiling, baking, frying, and mashing.
Many nations’ agricultural economies depend heavily on the production of potatoes, especially
those in temperate climates. The crop can be grown in a variety of settings, from big commercial
farms to tiny garden plots, and is generally easy to grow. Potato yields and disease resistance have
increased dramatically, due to improvements in breeding, pest control, and farming practices; as a
result, there is a consistent supply to meet global demand [Food and Agriculture Organization (FAO),
2008]. Furthermore, potatoes are essential for food security because they give millions of people
worldwide a consistent source of income and nutrition (Scott & Suarez, 2012).
Table B4 in Appendix B provides feature statistics tables for late blight disease of potato used
in this study. Figure 12 shows the image processing steps derived from the flow chart in Figure 2,
with sub-steps from step 2 (i.e., image preprocessing). Table 10 provides descriptions for each of the
processing steps as applied to the images of potato leaves.
Figure 12. Image processing steps as shown for two different original images of potato leaves
Note. Step 1: original; step 2a: noise reduction; step 2b: contrast enhancement; step 3: improved segmentation; step 4: feature extraction; step 5: classification. Source of original image (Step 1): Kaggle (2019), https://www.kaggle.com/datasets/arjuntejaswi/plant-village/
Step 1: Original image. The initial image showed a potato leaf with large spots marking a potential disease.
Step 2a: Noise reduction. Bilateral filtering is used to reduce noise while keeping edges sharp, as it uses both spatial and color information to smooth the image while preserving edges.
Step 2b: Contrast enhancement. The clahe.apply function applies the CLAHE algorithm to the L channel, enhancing its contrast.
Step 3: Improved segmentation. The Python function segment_image defines a lower and upper bound for the HSV values that correspond to the colors of interest. In this case, the color range is set to capture typical disease colors, as highlighted in the clusters below (Figure 9).
Step 4: Feature extraction. Technique used: contour detection and region properties analysis. Process: the connected components were labeled, and contours were extracted with bounding boxes drawn.
Step 5: Classification. The function uses predefined thresholds to classify the disease status based on the area of the segmented region and the mean color (specifically, the red component) of that region.
Step 6: Disease identification. Presented below in Figure 14 and Table 11.
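The cv2.inRange-style masking that segment_image relies on can be reproduced in NumPy. The bounds in the test are illustrative, not the paper's calibrated disease-color ranges.

```python
import numpy as np

def in_range(hsv, lower, upper):
    """Binary mask of pixels whose H, S, and V values all lie inside
    [lower, upper], analogous to cv2.inRange: 255 inside, 0 outside."""
    hsv = np.asarray(hsv)
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    mask = np.all((hsv >= lower) & (hsv <= upper), axis=-1)
    return mask.astype(np.uint8) * 255
```

The resulting mask feeds both feature extraction (pixel counts, mean color inside the mask) and the contour step (components of the mask).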
We present and discuss below step 6, as provided in Figure 2, for each of the four plants, namely, rice, banana, sunflower, and potato. We completed this part of our research using the AlexNet and GoogLeNet neural-network architectures, with which we identified plant diseases with an overall accuracy of 91.90%.
Figure 14. Visual performance metrics for Late Blight Disease of Potato
Table 11. Statistical performance metrics for Late Blight Disease of Potato
Deep learning architectures refer to the specific neural network models used for training. Common
architectures include AlexNet, GoogLeNet and ResNet. Different architectures have unique structures
and capabilities. For instance, GoogLeNet uses an Inception module to achieve higher accuracy
(Educative, 2024), while AlexNet is known for its simpler yet effective design (Analytics Vidhya,
2024). The choice of architecture can significantly influence the model's performance.
• Training from Scratch: The model is trained from randomly initialized weights.
• Finetuning (Transfer Learning): The model is pretrained on a large dataset and then fine-tuned
on the specific dataset of interest.
Training from scratch can be time-consuming and requires a large dataset to achieve good
performance. Finetuning leverages pre-trained weights, often resulting in faster convergence and
better performance, especially when data is limited.
8.1.2 Loss
Loss is a measure of how well the model's predictions match the actual labels. Common loss
functions include Cross-Entropy Loss for classification and Mean Squared Error for regression.
Monitoring the loss during training helps in understanding how well the model is learning. Lower
training and testing loss values indicate better model performance and generalization.
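A minimal illustration of the cross-entropy loss described above, for a single classified sample (the function and values are ours, not from the paper):

```python
import math

def cross_entropy(probs, true_index, eps=1e-12):
    """Cross-entropy loss for one sample: the negative log of the
    probability the model assigned to the true class."""
    return -math.log(max(probs[true_index], eps))
```

A confident correct prediction (probability near 1) gives a loss near 0, while an even spread over classes gives log(k) for k classes, which is why falling training and test loss curves indicate the model is learning.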
Table 12. Overall accuracy of statistical performance metrics of four plant types investigated
The choice of dataset type can affect the model's ability to learn relevant features. Color images
provide more information, potentially improving performance. Grayscale images reduce computational
complexity. Segmented images can help focus the model on critical areas, enhancing detection
capabilities.
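The grayscale reduction mentioned above can be sketched with standard ITU-R BT.601 luminance weights; this is a common convention, and the paper does not specify which conversion it used.

```python
import numpy as np

def to_grayscale(rgb):
    """Luminance-weighted grayscale (BT.601 weights, summing to 1),
    applied to the last axis of an H x W x 3 RGB array."""
    w = np.array([0.299, 0.587, 0.114])
    return np.asarray(rgb, dtype=float) @ w
```

The weighted sum keeps one channel instead of three, which is the computational saving the text refers to, at the cost of discarding the hue information the color and segmented datasets retain.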
8.2 Results of Using GoogLeNet and AlexNet for Step 6: Disease Identification
Table 12 below shows the overall accuracy for the statistical performance metrics of the four plant types investigated, with the results of using GoogLeNet and AlexNet. These results are from using different performance metrics for step 6 (disease identification) in the flow chart for leaf disease recognition shown in Figure 2.
Table 14 provides total pixels after classification, the number and percentages of healthy pixels
and the number and percentages of diseased pixels of the selected leaf images provided above. The
following section provides a summary of image processing results for these diseased leaves to classify
the healthy and diseased percentages of the leaves, and other characteristics.
The analysis of the provided leaf images involved several steps, including noise reduction,
contrast enhancement, segmentation, feature extraction, and classification. We compiled the results
from these steps showing pixel counts and area percentages for each image.
Leaf disease | Figure | Total pixels after classification | Healthy pixels | Diseased pixels
Leaf smut (rice) | Figure 2 (Top) | 45,936 | 16,583 (36.10%) | 29,353 (63.90%)
Leaf smut (rice) | Figure 2 (Bottom) | 55,750 | 32,775 (58.79%) | 22,975 (41.21%)
Rice average | | 50,843 | 24,679 (47.46%) | 26,164 (52.5%)
Black sigatoka (banana) | Figure 4 (Top) | 67,077 | 19,714 (29.39%) | 47,363 (70.61%)
Black sigatoka (banana) | Figure 4 (Bottom) | 65,024 | 5,325 (8.19%) | 59,699 (91.81%)
Banana average | | 66,050 | 12,519 (18.79%) | 53,531 (81.21%)
Leaf scars (sunflower) | Figure 6 (Top) | 89,693 | 34,738 (38.73%) | 54,955 (61.27%)
Leaf scars (sunflower) | Figure 6 (Bottom) | 89,873 | 20,024 (22.28%) | 69,849 (77.72%)
Sunflower average | | 89,783 | 27,381 (30.50%) | 62,402 (69.50%)
Late blight (potato) | Figure 8 (Top) | 67,599 | 43,730 (64.69%) | 23,869 (35.31%)
Late blight (potato) | Figure 8 (Bottom) | 65,792 | 38,534 (58.57%) | 27,257 (41.43%)
Potato average | | 66,695 | 41,132 (61.63%) | 25,563 (38.37%)
Overall averages | N/A | 60,749 | 26,428 (39.63%) | 41,915 (60.37%)
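The healthy and diseased percentages tabulated above follow directly from the classified pixel counts; a minimal sketch of the computation (the function name is ours):

```python
def leaf_percentages(healthy_pixels, diseased_pixels):
    """Healthy/diseased area percentages from classified pixel counts,
    rounded to two decimal places as in the table."""
    total = healthy_pixels + diseased_pixels
    return (round(100 * healthy_pixels / total, 2),
            round(100 * diseased_pixels / total, 2))
```

For example, the first rice image has 16,583 healthy and 29,353 diseased pixels out of 45,936, giving 36.10% healthy and 63.90% diseased, matching the first table row.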
Since the sample size of this analysis is limited to two images per plant type, as shown in Table 14, no major conclusions can be drawn. However, further analysis of this method with larger datasets will open the possibility of much deeper insight into image processing results for research on plant diseases.
• Disease Severity: The comparative analysis shows that some plants are more severely affected
than others. For example, the plant with 8.19% healthy area is much worse off compared to the
one with 64.69% healthy area.
• Spot Characteristics: Variability in the number and size of spots indicates different stages or
types of disease manifestation. Plants with many small spots might be experiencing an early
stage of the disease, while those with fewer and larger spots might be in a more advanced stage.
• Impact on Health: Overall, the analysis underscores that the disease’s impact is not uniform
across all plants. Some plants maintain a relatively higher proportion of healthy areas, which
could suggest better resistance or earlier detection and intervention.
11. CONCLUSION
Our image processing technique achieved an overall accuracy of 91.90% in disease detection across four different plant types. The analysis revealed that rice leaves had an average of 52.5% diseased pixels, while banana leaves had 81.21%, sunflower leaves 69.50%, and potato leaves 38.37%. These findings highlight the variability in disease impact across different plants and underscore the importance of tailored intervention strategies.
The research presented illustrates the significant potential of image processing techniques in
enhancing the early detection and management of plant diseases, which is crucial for agricultural
productivity. By utilizing the k-means algorithm for color segmentation and the GLCM (Gray Level
Co-occurrence Matrix) for disease classification, the automated approach detailed in this study offers
a more efficient alternative to traditional manual inspection methods. We successfully applied the
methodology to four different plant types of grain, fruit, flower, and vegetable, represented by rice, banana, sunflower, and potato, respectively, demonstrating its versatility and effectiveness across
diverse agricultural crops. The image processing steps, from noise reduction to feature extraction
and classification, allowed for precise identification and analysis of diseased areas, highlighting the
severity and spread of the diseases in a quantifiable manner.
As shown in the Appendix for the datasets used to test the image processing models, the number of images available in the Kaggle.com repositories is 40 for leaf smut disease of rice, 67 for black sigatoka disease of banana, 140 for leaf scars disease of sunflower, and 1,000 for late blight disease of potato. It would be infeasible to present image processing results for each of these images for each step of the algorithm. Hence, one future direction of this research is to continue to process additional samples, which we believe would yield comparable results.
Looking ahead, the system holds potential for further development into a real-time analysis platform. We successfully combined four different plant categories (grain, fruit, flower, and vegetable) into our disease detection system with strong results, showing how flexible and reliable our models are when it comes to identifying diseases in various plants. Moving forward, we can develop these models further by adding more detailed datasets to handle a wider variety of plant diseases and environmental conditions, including different lighting scenarios. With more resources, we can make our disease detection model more dependable. This means gathering specific information about individual plant diseases and training our models to recognize each one.
A more innovative approach will be to merge these specialized models. Doing so can create a powerful platform that diagnoses a wide range of plant diseases with great precision. This approach will not only enhance the accuracy of disease detection but also allow larger samples to be handled. Our ultimate goal is to develop a comprehensive tool that will be invaluable for farmers, researchers, and anyone involved in agriculture. By providing accurate and timely disease diagnoses, such a tool can help ensure healthier crops, better yields, and more sustainable farming practices.
CONFLICT OF INTERESTS
FUNDING
Dr. Richard Segall would like to acknowledge the support of the Neil Griffin College of Business
(NGCoB) at Arkansas State University (A-STATE) in Jonesboro for a 2024 Summer Research Grant
Award. Prasanna Rajbhandari would like to acknowledge a 2024 Summer Undergraduate Internship
awarded by Arkansas Biosciences Institute (ABI) that has headquarters located on the campus of
Arkansas State University in Jonesboro.
REFERENCES
Albattah, W., Javed, A., Nawaz, M., Masood, M., & Albahli, S. (2022). Artificial intelligence-based drone
system for multiclass plant disease detection using an improved efficient convolutional neural network. Frontiers
in Plant Science, 13, 808380. Advance online publication. DOI:10.3389/fpls.2022.808380 PMID:35755664
Aldakheel, E. A., Zakariah, M., & Alabdalall, A. H. (2024). Detection and identification of plant leaf diseases
using YOLOv4. Frontiers in Plant Science, 15, 1355941. DOI:10.3389/fpls.2024.1355941 PMID:38711603
Analytics Vidhya. (2024). Introduction to the architecture of AlexNet. Retrieved from https://www.analyticsvidhya
.com/blog/2021/03/introduction-to-the-architecture-of-alexnet/
Arya, S., & Singh, R. (2019). A comparative study of CNN and AlexNet for detection of disease in potato and
mango leaf. In Proceedings of 2nd International Conference on Issues and Challenges in Intelligent Computing
Techniques (ICICT) (pp. 1—6). DOI:10.1109/ICICT46931.2019.8977648
Bharate, A. A., & Shirdhonkar, M. S. (2017). A review on plant disease detection using image processing. In
Proceedings of the 2017 International Conference on Intelligent Sustainable Systems (ICISS), (pp. 103—109).
DOI:10.1109/ISS1.2017.8389326
Chen, H.-C., Widodo, A. M., Wisnujati, A., Rahaman, M., Lin, J. C.-W., Chen, L., & Weng, C.-E. (2022). AlexNet
convolutional neural network for disease detection and classification of tomato leaf. Electronics (Basel), 11(6),
951. DOI:10.3390/electronics11060951
Chollet, F. (2015). Keras: The Python deep learning API. https://keras.io
Choudhury, S. S., Samal, A., & Awada, T. (2019). Leveraging image analysis for high-throughput plant
phenotyping. Frontiers in Plant Science, 10, 508. Advance online publication. DOI:10.3389/fpls.2019.00508
PMID:31068958
CIP International Potato Center. (2018). Potato facts and figures. https://cipotato.org/potato
Demilie, W. B. (2024). Plant disease detection and classification techniques: A comparative study of the
performances. Journal of Big Data, 11(1), 5. Advance online publication. DOI:10.1186/s40537-023-00863-9
Devaraj, A., Rathan, K., Jaahnavi, S., & Indira, K. (2019). Identification of plant disease using image processing
technique. In Proceedings of the 2019 International Conference on Communication and Signal Processing
(ICCSP) (pp. 749—753). DOI:10.1109/ICCSP.2019.8698056
Devi, S., Ritika, & Gupta, B. (2019). GLCM-LBP plant leaf disease detection. International Journal of Scientific Research and Engineering Development, 2(3), 136–140. https://ijsred.com/volume2/issue3/IJSRED-V2I3P19.pdf
Educative. (2024). What is GoogLeNet? Retrieved from https://www.educative.io/answers/what-is-googlenet
Feng, L. (2022). Application analysis of artificial intelligence algorithms in image processing. Mathematical
Problems in Engineering, 2022(1), 7382938. DOI:10.1155/2022/7382938
Fick, G. N., & Miller, J. F. (1997). Sunflower breeding. In Schneiter, A. A. (Ed.), Sunflower technology and
production (pp. 395–440). American Society of Agronomy.
Food and Agriculture Organization. (2008). International year of the potato 2008. Food and Agriculture
Organization of the United Nations. https://www.fao.org/potato-2008/en/
Food and Agriculture Organization. (2013). Rice market monitor. Food and Agriculture Organization of the United Nations. https://www.fao.org/economic/est/publications/rice-publications/rice-market-monitor-rmm/en/
Food and Agriculture Organization. (2019). Banana market review. Food and Agriculture Organization of the
United Nations. https://www.fao.org/economic/est/est-commodities/bananas/en/
Fulari, U., Shastri, R., & Fulari, A. (2020). Leaf disease detection using machine learning. Journal of Seybold
Report, 15(9), 1828–1832. https://www.researchgate.net/publication/344282301_Leaf_Disease_Detection
_Using_Machine_Learning
Gehan, M. A., Fahlgren, N., Abbasi, A., Berry, J. C., Callen, S. T., Chavez, L., Doust, A. N., Feldman, M. J.,
Gilbert, K. B., Hodge, J. G., Hoyer, J. S., Lin, A., Liu, S., Lizárraga, C., Lorence, A., Miller, M., Platon, E.,
Tessman, M., & Sax, T. (2017). PlantCV v2: Image analysis software for high-throughput plant phenotyping.
PeerJ, 5, e4088. DOI:10.7717/peerj.4088 PMID:29209576
Gupta, A., Kumar, M. S., Kumar, M. R., & Kumar, D. H. (2023). Deep learning technique used for tomato
and potato plant leaf disease classification and detection. In Proceedings of the 2023 International Conference
on Smart Systems for applications in Electrical Sciences (ICSSES) (pp. 1—6). IEEE. DOI:10.1109/
ICSSES58299.2023.10199327
Hall, R. (2022, January 24). Arkansas Plant Health Clinic's updated Plant Disease Image Database now available. https://www.uaex.uada.edu/media-resources/news/2022/01-24-2022-Ark-plant-disease-database.aspx
Harpale, A. (2024). Crop disease detection. https://github.com/aish-where-ya/Crop-Disease-Detection
Harpale, A. (n.d.). Plant disease detection using an IoT device. https://aish-where-ya.github.io/portfolio/plant
-disease/
Hasan, M. N., Mustavi, M., Jubaer, M. A., Shahriar, M. T., & Ahmed, T. (2022). Plant leaf disease detection
using image processing: A comprehensive review. Malaysian Journal of Science and Advanced Technology,
2(4), 174–182. DOI:10.56532/mjsat.v2i4.80
Hossain, M. S., Ahmed, S. I., Nadim, Md., Rahman, M. M., Shenjuti, M. M., Hasan, M., Jabid, T., Islam, M.,
& Ali, M. S. (2023). DeepCONVSVM: A comprehensive model for detecting disease in mango leaves. In
Proceedings of the 2023 4th International Conference on Big Data Analytics and Practices (IBDAP) (pp. 1—6).
DOI:10.1109/IBDAP58581.2023.10272007
Hossain, S., Mou, R. M., Hasan, M. M., Chakraborty, S., & Razzak, M. A. (2018). Recognition and detection of
tea leaf’s diseases using support vector machine. In Proceedings of the 2018 IEEE 14th International Colloquium
on Signal Processing & Its Applications (CSPA) (pp. 150—154). DOI:10.1109/CSPA.2018.8368703
International Rice Research Institute. (2020). Rice knowledge bank. https://www.irri.org/rice-knowledge-bank
Islam, M., Dinh, A., Wahid, K., & Bhowmik, P. (2017). Detection of potato diseases using image segmentation
and multiclass support vector machine. In Proceedings of the 2017 IEEE 30th Canadian Conference on Electrical
and Computer Engineering (CCECE) (pp. 1—4). DOI:10.1109/CCECE.2017.7946594
Jafar, A., Bibi, N., Ali Naqvi, R. A., Sadeghi-Niaraki, A., & Jeong, D. (2024). Revolutionizing agriculture with
artificial intelligence: Plant disease detection methods, applications, and their limitations. Frontiers in Plant
Science, 15, 1356260. Advance online publication. DOI:10.3389/fpls.2024.1356260 PMID:38545388
Joon, B., Kumar, R., & Jaiswal, A. (2024). Smartplantcare: Plant disease detection using machine learning
algorithms. Journal of Engineering Design and Computational Science, 3(3).
Kaggle. (2019). Plant Village dataset. https://www.kaggle.com/datasets/arjuntejaswi/plant-village/
Kaggle. (2020). Rice leaf diseases dataset. https://www.kaggle.com/datasets/vbookshelf/rice-leaf-diseases
Kaggle. (2022). Sunflower fruits and leaves dataset. https://www.kaggle.com/datasets/noamaanabdulazeem/sunflower-fruits-and-leaves-dataset
Kaggle. (2023). Banana disease recognition dataset. https://www.kaggle.com/datasets/sujaykapadnis/banana-disease-recognition-dataset
Kakran, N., Singh, P. K., & Jayant, P. (2019). Detection of disease in plant leaf using image segmentation.
International Journal of Computer Applications, 178(35), 29–32. DOI:10.5120/ijca2019919229
Keim, R., & Humphrey, W. A. (1987). Diagnosing ornamental plant diseases: An illustrated handbook. University
of California Division of Agriculture and Natural Resources.
Khirade, S. D., & Patil, A. B. (2015). Plant disease detection using image processing. In Proceedings of the 2015
International Conference on Computing Communication Control and Automation (pp. 768—771). DOI:10.1109/
ICCUBEA.2015.153
Khush, G. S. (1997). Origin, dispersal, cultivation, and variation of rice. Plant Molecular Biology, 35(1-2),
25–34. DOI:10.1023/A:1005810616885 PMID:9291957
Krishna, V., Raghavendran, C. V., & Faruk, S. K. U. (2024). Novel computer vision and color image segmentation for agricultural application. In Proceedings of the 1st International Conference on Disruptive Technologies in Computing and Communication Systems (ICDTCCS 2023) (paper 19, pp. 115–120).
Kulkarni, K., Karwande, A., Kolhe, T., Kamble, S., Joshi, A., & Wyawahare, M. (2021). Plant disease detection
using image processing and machine learning. https://arxiv.org/pdf/2106.10698
Kumar, R., & Jindal, V. (2022). A survey of plant disease detection techniques based on image processing and
machine learning. International Journal of Health Sciences, 6(S6), 1954–1967. DOI:10.53730/ijhs.v6nS4.10096
Kumar, S., & Kaur, R. (2015). Plant disease detection using image processing: A review. International Journal
of Computer Applications, 124(16), 6–9. DOI:10.5120/ijca2015905789
Lai, A. M., & Jadhav, S. (2022). Multi-class plant leaf disease detection using a deep convolutional neural network.
International Journal of Information System Modeling and Design, 13(1), 1–14. DOI:10.4018/IJISMD.315126
Li, H.-A., Zheng, Q., Qi, X., Yan, W., Wen, Z., Li, N., Tang, C., & Ahmed, S. H. (2021). Neural network-based
mapping mining of image style transfer in big data systems. Computational Intelligence and Neuroscience,
2021(1), 8387382. Advance online publication. DOI:10.1155/2021/8387382 PMID:34475949
Li, Z., Guo, R., Li, M., Chen, Y., & Li, G. (2020). A review of computer vision technologies for plant
phenotyping. Computers and Electronics in Agriculture, 176, 105672. Advance online publication. DOI:10.1016/j.
compag.2020.105672
Liu, J., & Wang, X. (2021). Plant diseases and pests detection based on deep learning: A review. Plant Methods,
17(22), 22. Advance online publication. DOI:10.1186/s13007-021-00722-9 PMID:33627131
Lu, Y., Yi, S., Zeng, N., Liu, Y., & Zhang, Y. (2017). Identification of rice diseases using deep convolutional
neural networks. Neurocomputing, 267, 378–384. DOI:10.1016/j.neucom.2017.06.023
Mangina, E., Burke, E., Matson, R., O’Briain, R., Caffrey, J. M., & Saffari, M. (2022). Plant species detection
using image processing and deep learning: A mobile-based application. In Bochtis, D. D., Moshou, D. E.,
Vasileiadis, G., Balafoutis, A., & Pardalos, P. M. (Eds.), Information and communication technologies for
agriculture—Theme II: Data. Springer., DOI:10.1007/978-3-030-84148-5_5
Manisha, J. A., Pattavi, J. T., Pragita, J. V., & Aryan, C. C. (2016). Plant disease detection and its treatment using
image processing. International Journal of Advanced Research in Electrical, Electronics, and Instrumentation
Engineering, 5(3). Advance online publication. DOI:10.15662/IJAREEIE.2016.0503025
Mannu, S. D., & Jai, S. (2022). Plants diseases detection: (A brief review). International Journal of Innovative Science and Research Technology, 7(9), 98–102. https://www.scribd.com/document/595607130/Plants-Diseases-Detection-a-Brief-Review
Maragathavalli, P., & Jana, S. (2023). A review of plant disease detection methods using image processing
approaches. In Proceedings of the 9th International Conference on Advanced Computing and Communication
Systems (ICACCS) (pp. 1306—1313). DOI:10.1109/ICACCS57279.2023.10112981
MathWorks. (2024). MATLAB and Simulink for image processing. https://www.mathworks.com/discovery/
what-is-matlab.html
McGregor, S. E. (1976). Insect pollination of cultivated crop plants. U.S. Department of Agriculture.
Metre, V. A., & Sawarkar, S. D. (2022). Scope of optimization in plant leaf disease detection using deep learning
and swarm intelligence. In Agarwal, S., Gupta, M., Agrawal, J., Le, D.-N., & Gupta, K. K. (Eds.), Swarm
intelligence and machine learning (pp. 179–204). CRC Press. DOI:10.1201/9781003240037-11
Minch. (2023). Plant disease detection using image processing. Journal of Advanced Research in Intelligence
Systems and Robotics, 5(2). https://www.adrjournalshouse.com/index.php/Intelligence-Robotics-Sysytem/
article/view/1878
Mitra, D., & Gupta, S. (2023). Comparison of proposed plant leaf diseases detection algorithm with existing
state-of-the-art techniques. In Proceedings of the 2023 International Conference on Computing, Communication,
and Intelligent Systems (ICCCIS), (pp. 472—477). DOI:10.1109/ICCCIS60361.2023.10425355
Mohanty, S. P., Hughes, D. P., & Salathé, M. (2016). Using deep learning for image-based plant disease detection.
Frontiers in Plant Science, 7, 1419. DOI:10.3389/fpls.2016.01419 PMID:27713752
Munigala, S., Nampelli, P., & Tandra, P. (2021). Detection of plant diseases using data mining techniques: A
literature survey. In Communications Software and Networks, Lecture Notes in Networks and Systems 134, (pp.
39—46). DOI:10.1007/978-981-15-5397-4_5
Nagila, A., & Mishra, A. K. (2023). The effectiveness of machine learning and image processing in detecting
plant leaf disease. The Scientific Temper, 14(1), 8–13. DOI:10.58414/SCIENTIFICTEMPER.2023.14.1.02
OpenCV. (2024). OpenCV-Python tutorials. https://docs.opencv.org/master/d6/d00/tutorial_py_root.html
Patil, B., Panchal, H., Yadav, S., Singh, A., & Patil, S. (2017). Plant monitoring using image processing, Raspberry
Pi & IOT. International Research Journal of Engineering and Technology, 10(10), 1337–1342.
Picon, A., Alvarez-Gila, A., Seitz, M., Ortiz-Barredo, A., Echazarra, J., & Johannes, M. (2019). Deep
convolutional neural networks for mobile capture device-based crop disease classification in the wild. Computers
and Electronics in Agriculture, 161, 280–290. DOI:10.1016/j.compag.2018.04.002
Rahaman, M. M., Ahsan, M. A., & Chen, M. (2019). Data-mining techniques for image-based plant phenotypic
traits identification and classification. Scientific Reports, 9(1), 19526. Advance online publication. DOI:10.1038/
s41598-019-55609-6 PMID:31862925
Ramamoorthy, R., Kumar, E. S., Naidu, R. Ch. A., & Shruthi, K. (2023). Reliable and accurate plant leaf disease
detection with treatment suggestions using enhanced deep learning techniques. SN Computer Science, 2(2), 158.
Advance online publication. DOI:10.1007/s42979-022-01589-w
Robinson, J. C., & Sauco, V. G. (2010). Bananas and plantains. CABI Digital Library.
DOI:10.1079/9781845936587.0000
Sahu, P., Chug, A., Singh, A. P., Singh, D., & Singh, R. P. (2021). Challenges and issues in plant disease detection
using deep learning. In Dua, M., & Jain, A. (Eds.), Handbook of research on machine learning techniques for
pattern recognition and information security (pp. 56–74). IGI Global., DOI:10.4018/978-1-7998-3299-7.ch004
Saiwa. (2023, October 14). Plant disease detection using image processing - transforming agriculture. https://
saiwa.ai/blog/plant-disease-detection-using-image-processing/
Scott, G. J., & Suarez, V. (2012). The rise of Asia as the centre of global potato production and some implications
for industry. Potato Journal, 39(1), 1–22.
Shifani, S. A., & Ramkumar, G. (2019). Review on plant disease detection using image processing techniques.
LAP Lambert Academic Publishing.
Simplilearn. (2024). What is image processing: Overview, applications, benefits, and more. Retrieved from
https://www.simplilearn.com/image-processing-article
Singh, V., & Misra, A. K. (2017). Detection of plant leaf diseases using image segmentation and soft computing
techniques. Information Processing in Agriculture, 4(1), 41–49. DOI:10.1016/j.inpa.2016.10.005
Sinha, A., & Shekhawat, R. S. (2020). Review of image processing approaches for detecting plant diseases. IET
Image Processing, 14(8), 1427–1439. DOI:10.1049/iet-ipr.2018.6210
Skoric, D. (2012). Sunflower breeding. Serbian Academy of Sciences and Arts.
Smith, M. K., Hamill, S. D., & Daniells, J. W. (2007). Advances in banana cultivation under tissue culture and
disease management. Plant Cell, Tissue and Organ Culture, 88(2), 123–132.
Song, C., Wang, C., & Yang, Y. (2020). Automatic detection and image recognition of precision agriculture for
citrus diseases. In Proceedings of the 2020 IEEE Eurasia Conference on IOT, Communication and Engineering
(ECICE), (pp. 187—190). DOI:10.1109/ECICE50847.2020.9301932
Sridhathan, C., & Kumar, M. S. (2018). Plant infection detection using image processing. International Journal
of Modern Engineering Research, 8(7), 13–18.
Supriya, S., & Aravinda, H. L. (2022). Green leaf disease detection and identification using Raspberry Pi.
International Research Journal of Engineering and Technology, 9(8), 279–287.
Surridge, C. (2004). Rice cultivation and biodiversity. Nature, 428(6982), 360–361. DOI:10.1038/428360a
PMID:15042057
Thangavel, M., Gayathri, P. K., Sabari, K. R., & Prathiksha, V. (2022). Plant leaf disease detection using deep
learning. International Journal of Engineering Research & Technology (Ahmedabad), 10(8), 34–37.
Tovar, J. C., Hoyer, J. S., Lin, A., Tielking, A., Callen, S. T., Elizabeth Castillo, S., Miller, M., Tessman, M.,
Fahlgren, N., Carrington, J. C., Nusinow, D. A., & Gehan, M. A. (2018). Raspberry Pi-powered imaging for
plant phenotyping. Applications in Plant Sciences, 6(3), e1031. DOI:10.1002/aps3.1031 PMID:29732261
Tripathy, S. S., Poddar, R., Satapathy, L., & Mukhopadhyay, K. (2022). Image processing–based artificial
intelligence system for rapid detection of plant diseases. In Sharma, P., Yadav, D., & Gaur, R. K. (Eds.),
Bioinformatics in agriculture (pp. 619–624). Academic Press. DOI:10.1016/B978-0-323-89778-5.00023-4
Tutors India. (2023, January 3). Top 13 image processing tools to expect in 2023. https://www.tutorsindia.com/
blog/top-13-image-processing-tools-to-expect-2023/
Urban, M., Cuzick, A., Seager, J., Wood, V., Rutherford, K., Venkatesh, S. Y., De Silva, N., Martinez, M. C.,
Pedro, H., Yates, A. D., Hassani-Pak, K., & Hammond-Kosack, K. E. (2020). PHI-base: The pathogen–host
interactions database. Nucleic Acids Research, 48(D1), D613–D620. DOI:10.1093/nar/gkz904 PMID:31733065
Urban, M., Cuzick, A., Seager, J., Wood, V., Rutherford, K., Venkatesh, S. Y., Sahu, J., Iyer, S. V., Khamari, L.,
De Silva, N., Martinez, M. C., Pedro, H., Yates, A. D., & Hammond-Kosack, K. E. (2022). PHI-base in 2022: A
multi-species phenotype database for Pathogen-Host Interactions. Nucleic Acids Research, 50(D1), D837–D847.
DOI:10.1093/nar/gkab1037 PMID:34788826
Wang, Y., Lu, Y., & Li, Y. (2019). A new image segmentation method based on support vector machine. In
Proceedings of the 2019 IEEE 4th International Conference on Image, Vision and Computing (ICIVC) (pp.
177—181). DOI:10.1109/ICIVC47709.2019.8981000
Yin, S., Wu, C., Liang, J., Shi, J., Li, H., Ming, G., & Duan, N. (2023). DragNUWA: Fine-grained control in
video generation by integrating text, image, and trajectory. arXiv:2308.08089, 1.
APPENDIX A
Step 1: Data preparation
1. Download the datasets from Kaggle.
2. Organize the images into folders based on plant types and their respective disease categories.
3. Resize images to a consistent size.
4. Normalize pixel values.
5. Split the data into training, validation, and testing sets (e.g., 70% training, 15% validation, and 15% testing).
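The data preparation steps above can be sketched in plain NumPy. The nearest-neighbor resize below is a simplified stand-in for a proper image library's resizing; the 128x128 target size is an illustrative choice, while the 70/15/15 split follows the example in the table.

```python
import numpy as np

def resize_nearest(img, size=(128, 128)):
    """Nearest-neighbor resize: pick source rows/columns for each target pixel."""
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows][:, cols]

def prepare(images, train=0.7, val=0.15, seed=0):
    """Resize, normalize to [0, 1], shuffle, and split 70/15/15."""
    rng = np.random.default_rng(seed)
    data = np.stack([resize_nearest(im) for im in images]).astype(np.float32) / 255.0
    idx = rng.permutation(len(data))
    n_tr = int(train * len(data))
    n_va = int(val * len(data))
    return (data[idx[:n_tr]],                 # training set
            data[idx[n_tr:n_tr + n_va]],      # validation set
            data[idx[n_tr + n_va:]])          # testing set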
Step 2: Model training
1. Use pretrained CNNs such as ResNet, VGG (Visual Geometry Group), Inception, or custom CNN architectures.
2. Apply data augmentation techniques to increase the diversity of the training set (e.g., rotations, flips, zooms).
3. Load the training and validation datasets.
4. Compile the model with appropriate loss functions and optimizers (e.g., categorical cross-entropy and the Adam optimizer).
5. Train the model using the training set, while validating on the validation set.
6. Monitor performance metrics such as accuracy and loss.
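Step 2 names pretrained CNN backbones and framework-level training; as a library-free sketch of the same ideas, the following pairs a flip-based augmentation with a linear softmax head trained by categorical cross-entropy. The linear head is a toy stand-in for a CNN, and plain gradient descent stands in for the Adam optimizer.

```python
import numpy as np

def augment_flips(x, y):
    """Double the training set by adding horizontal flips (axis 2 = width)."""
    return np.concatenate([x, x[:, :, ::-1]]), np.concatenate([y, y])

def train_softmax(x, y, classes, epochs=50, lr=0.5):
    """Train a linear softmax classifier with categorical cross-entropy."""
    feats = x.reshape(len(x), -1)
    w = np.zeros((feats.shape[1], classes))
    onehot = np.eye(classes)[y]
    for _ in range(epochs):
        logits = feats @ w
        p = np.exp(logits - logits.max(1, keepdims=True))  # stable softmax
        p /= p.sum(1, keepdims=True)
        # Gradient of cross-entropy w.r.t. the weights.
        w -= lr * feats.T @ (p - onehot) / len(x)
    return w
```

In the actual pipeline, a framework such as Keras would supply the CNN, augmentation layers, and optimizer; this sketch only makes the loss and augmentation concrete.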
Step 3: Model evaluation
1. Adjust hyperparameters such as learning rate, batch size, and number of epochs to optimize model performance.
2. Load the test dataset.
3. Evaluate the model on the test set to determine its accuracy and other performance metrics (e.g., precision, recall, and F1 score).
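The metrics named in Step 3 can all be read off a confusion matrix. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def evaluate(y_true, y_pred, classes):
    """Accuracy plus per-class precision, recall, and F1 from a confusion matrix."""
    cm = np.zeros((classes, classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                      # rows: true class, cols: predicted
    tp = np.diag(cm).astype(float)
    prec = tp / np.maximum(cm.sum(0), 1)   # of those predicted as c, how many were c
    rec = tp / np.maximum(cm.sum(1), 1)    # of the true c, how many were found
    f1 = np.where(prec + rec > 0,
                  2 * prec * rec / np.maximum(prec + rec, 1e-12), 0.0)
    acc = tp.sum() / cm.sum()
    return acc, prec, rec, f1
```

Reporting per-class values matters here because the four disease datasets differ greatly in size, so overall accuracy alone can hide poor performance on the smaller classes.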
Richard S. Segall is Professor of Information Systems & Business Analytics in Neil Griffin College of Business at
Arkansas State University in Jonesboro. He holds a BS and MS in mathematics, an MS in operations research and statistics from Rensselaer Polytechnic Institute in Troy, New York, and a PhD in operations research from the University of
Massachusetts at Amherst. He has served on the faculty of Texas Tech University, University of Louisville, University
of New Hampshire, University of Massachusetts-Lowell, and West Virginia University. His research interests include
data mining, big data, text mining, web mining, database management, and mathematical modeling. His funded
research includes that by United States Air Force (USAF), National Aeronautics and Space Administration (NASA),
Arkansas Biosciences Institute (ABI), and Arkansas Science & Technology Authority (ASTA). He was a member of
the former Arkansas Center for Plant-Powered-Production (P3) and is a member of the Center for No-Boundary Thinking
(CNBT), serves on the editorial boards of the International Journal of Data Mining, Modelling and Management
(IJDMMM), International Journal of Data Science (IJDS), and International Journal of Fog Computing (IJFC), and
is co-editor of five books: (1.) Biomedical and Business Applications Using Artificial Neural Networks and Machine
Learning, (2.) Open Source Software for Statistical Analysis of Big Data, (3.) Handbook of Big Data Storage and
Visualization Techniques, (4.) Research and Applications in Global Supercomputing, and (5.) Visual Analytics of
Interactive Technologies: Applications to Data, Text & Web Mining.
Prasanna Rajbhandari is originally from Nepal and an undergraduate student majoring in Information Systems &
Business Analytics (ISBA) in Neil Griffin College of Business at Arkansas State University in Jonesboro. He has a
strong passion for data-driven decision making and is preparing for a career in the global business environment.