
© 2022 JETIR June 2022, Volume 9, Issue 6 www.jetir.org (ISSN-2349-5162)

Using Deep Learning to Predict Plant Growth and Yield in Greenhouse Environments
Mudugula Babu, IV B.Tech Student, Dept of IT, Sreenidhi Institute of Science and Technology (A),
Hyderabad. [email protected]

Talla Umesh, IV B.Tech Student, Dept of IT, Sreenidhi Institute of Science and Technology (A),
Hyderabad. [email protected]

Dr. Sreenivas Mekala, Associate Professor, Dept of IT, Sreenidhi Institute of Science and
Technology (A), Hyderabad.

ABSTRACT:

This paper predicts ficus plant growth and crop yield by evaluating the performance of several machine learning algorithms: Support Vector Regression (SVR), Random Forest (RF) regression, and a Long Short-Term Memory (LSTM) deep neural network. SVR and RF are traditional algorithms whose prediction performance is expected to be lower because they do not use deep learning techniques. To overcome this limitation, the LSTM deep neural network is used to predict plant growth.

1. INTRODUCTION:

Deep Learning (DL) extends classical ML by adding more "depth" (complexity) into the model, as well as transforming the data using various functions that create data representations in a hierarchical way, through several levels of abstraction. A strong advantage of DL is feature learning, i.e., automatic feature extraction from raw data, with features in higher levels of the hierarchy being formed through composition of lower level features. DL can solve complex problems particularly well and fast, due to the more complex models used, which also allow massive parallelization. These complex models employed in DL can increase classification accuracy, or reduce error in regression problems, provided there are adequately large datasets available describing the problem. DL includes different components, such as convolutions, pooling layers, fully connected layers, gates, memory cells, activation functions, encoding/decoding schemes,

JETIR2206616 Journal of Emerging Technologies and Innovative Research (JETIR) www.jetir.org g107

depending on the network architecture used, e.g., Convolutional Neural Networks, Recurrent Neural Networks and Unsupervised Networks.

The LSTM model was introduced with the objective of modelling long-term dependencies and determining the optimal time lag for time series problems. An LSTM network is composed of one input layer, one recurrent hidden layer, and one output layer. The basic unit in the hidden layer is the memory block, containing memory cells with self-connections for memorizing the temporal state and a pair of adaptive, multiplicative gating units controlling information flow in the block. The memory cell is primarily a recurrently self-connected linear unit, called the Constant Error Carousel (CEC), and the cell state is represented by the activation of the CEC. The multiplicative gates learn when to open and close. By keeping the error flowing through the CEC constant, LSTM addresses the vanishing gradient problem. Moreover, a forget gate is added to the memory cell, preventing the gradient from exploding when learning long time series.

2. LITERATURE REVIEW:

As with many bio-systems, plant growth is a highly complex and dynamic, environmentally linked system. Therefore, growth and yield modeling is a significant scientific challenge. Modeling approaches vary in a number of aspects (including scale of interest, level of description, integration of environmental stress, etc.). According to (Todorovski and Dzeroski, 2006; Atanasova et al., 2008), two basic modeling approaches are possible, namely "knowledge-driven" and "data-driven" modeling. The knowledge-driven approach relies mainly on existing domain knowledge. In contrast, a data-driven modeling approach is capable of formulating a model solely from gathered data without necessarily using domain knowledge. Data-driven models (DDM) include classical Machine Learning techniques, artificial neural networks (Daniel et al., 2008), support vector machines (Pouteau et al., 2012), and generalized linear models. Those methods have many desirable characteristics, such as imposing fewer restrictions or assumptions, the ability to approximate nonlinear functions, strong predictive abilities, and the flexibility to adapt to inputs of a multivariate system (Buhmann, 2003).

According to Singh et al. (2016), and as reviewed by Liakos et al. (2018), Machine Learning (ML), linear polarizations, wavelet-based filtering, vegetation indices (NDVI) and regression analysis are the most popular techniques used for analyzing agricultural data. Besides these techniques, a methodology that has recently been gaining momentum is deep learning (DL) (Goodfellow et al., 2016).

DL belongs to the machine learning computational field and is similar to ANN. However, DL is about "deeper" neural networks that provide a hierarchical representation of the data by means of various operations. This allows larger learning capabilities, and thus higher performance and precision. A strong advantage of DL is feature learning, i.e., automatic feature extraction from raw data, with features from higher levels of the hierarchy being formed by composition of lower level features (Goodfellow et al., 2016).

DL can solve more complex problems particularly well, because of the more complex related models


(Pan and Yang, 2010). These complex models employed in DL can increase classification accuracy and reduce error in regression problems, provided there are adequately large datasets available describing the problem. Gonzalez-Sanchez et al. (2019) presented a comparative study of ANN, SVR, M5-Prime and kNN ML techniques and Multiple Linear Regression (MLR) for crop yield prediction in ten crop datasets. In their study, Root Mean Square Error (RMSE), Root Relative Square Error (RRSE), Normalized Mean Absolute Error (MAE) and Correlation Factor (R) were used as accuracy metrics to validate the models. Results showed that M5-Prime achieved the lowest errors across the produced crop yield models. The study ranked the techniques from best to worst, according to RMSE, RRSE, R and MAE, in the following order: M5-Prime, kNN, SVR, ANN and MLR. Another study (Nair and Yang-Won, 2016) applied four ML techniques, SVM, Random Forest (RF), Extremely Randomized Trees (ERT) and Deep Learning (DL), to estimate corn yield in the state of Iowa. Comparisons of the validation statistics showed that DL provided more stable results, overcoming the overfitting problem.

Stem diameter is considered one of the important parameters describing the growth of plants during the vegetative growth stage. The variation of stem diameter has also been widely used to derive proxies for plant water status and is therefore applied in optimisation strategies for plant-based irrigation scheduling in a wide range of species. Plant stem diameter variation (SDV) refers to the periodic shrinkage and recovery movement of the plant stem during the day and night; this periodic variation is related to plant water content and can be used as an indicator of changes in it. During active vegetative growth and development, crop plants rely on the carbohydrate gained from photosynthesis and the translocation of photo-assimilates from the site of synthesis to sink organs (Yu et al., 2015). The fundamentals of stem diameter variations have been well documented in a substantial amount of literature (Vandegehuchet et al., 2014).

It has been documented that SDV is sensitive to water and nutrient conditions and is closely related to the responses of crop plants to changes in environmental conditions (Kanai et al., 2008). The stem diameter is an important parameter describing the growth of crop plants under abiotic stress during the vegetative growth stage. Therefore, it is important to generate stem diameter growth models able to predict the response of SDV to environmental changes and plant growth under different conditions. Many studies emphasize the need to critically review and improve SDV models for assessment of environmental impact on crop growth (Hinckley and Bruckerhoff, 2011). Daily SDV models have been developed to accurately predict inter-annual variation in annual growth in balsam fir (Abies balsamea L.) (Duchesene and Houle, 2011). Inclusion of daily data in growth-climate models can improve predictions of the potential growth response to climate by identifying particular climatic events that escape a classical dendroclimatic approach (Duchesene and Houle, 2011).

However, models for predicting SDV and plant growth using environmental variables have so far remained limited. Tomato crop growing in


greenhouse environment is considered a dynamic and complex system, with few models having been studied for it up to now. In the literature, TOMGRO (Jones et al., 1999) and TOMSIM (Heuvelink, 1996) are considered the main applicable dynamic growth models. Those models depend on physiological processes and represent biomass partitioning, crop growth, and yield as a function of several climate and physiological parameters. However, due to their limited application to practical settings, their complexity, the difficulty in estimating initial parameter values and the need for calibration and validation in every new environment, grower uptake has been limited. The Tompousse model was developed by (Abreu et al., 2000) to predict tomato yield in terms of the weight of harvested fruits. The model was developed by examining the relationship between environmental parameters in heated greenhouses in the southern part of France. A linear relationship between flowering rate and fruit growth was the basic assumption used in this model. However, the model performance was poor when tested in unheated plastic greenhouses in Portugal. Another tomato yield model was proposed by Adams (Adams, 2002), based on a form of graphical simulation tool. The main objective of the model was to represent weekly fluctuations of greenhouse tomato yield in terms of fruit size and harvest rate. Hourly climate data were used to estimate the rate of growth of leaf truss and the flower production. Seasonal yield fluctuations were generally influenced by periodic variations of solar radiation and air temperature. According to (Qaddoum et al., 2013), there is a large number of tools that can help farmers in making decisions.

3. METHODOLOGY:

This project consists of the following modules:

1) Upload dataset: this module uploads the Ficus plant dataset.
2) Dataset cleaning: this module finds empty values in the dataset and replaces them with the mean or with 0.
3) Train & test split: this module splits the dataset into training and testing parts. Each algorithm uses 80% of the dataset for training and the remaining 20% to test prediction accuracy. The more accurate the predictions, the lower the Mean Square Error (MSE), Root Mean Square Error (RMSE) and Mean Absolute Error (MAE).
4) Run SVR algorithm: this module trains the SVR model on the 80% training split and uses the 20% test split to calculate its performance.
5) Run Random Forest algorithm: this module trains the Random Forest model on the 80% training split and uses the 20% test split to calculate its performance.
6) Run LSTM algorithm: this module trains the LSTM model on the 80% training split and uses the 20% test split to calculate its performance.
7) Predict plant growth & yield: this module uploads test data and applies the trained LSTM model to predict its growth value.
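The LSTM used in this project is built from the memory block described in the introduction. As a rough illustration only, the following sketch runs one time step of a single-unit memory block in plain Python. The function `lstm_step`, the gate names and all weight values are hypothetical choices made for this example; they are not taken from the project code, which presumably relies on a deep learning library.

```python
import math


def sigmoid(x):
    """Logistic function used by the multiplicative gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM memory block (hypothetical helper).

    x       -- scalar input at this time step
    h_prev  -- previous block output
    c_prev  -- previous cell state (the CEC activation)
    params  -- dict mapping a gate name to (w_x, w_h, bias)
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))    # input gate: how much new information enters
    f = sigmoid(pre("forget"))   # forget gate: how much old state is kept
    o = sigmoid(pre("output"))   # output gate: how much state is exposed
    g = math.tanh(pre("cand"))   # candidate value to write into the cell

    c = f * c_prev + i * g       # self-connected linear cell update
    h = o * math.tanh(c)         # gated block output
    return h, c


# Illustrative weights only; a trained network would learn these.
params = {
    "input": (0.5, 0.1, 0.0),
    "forget": (0.3, 0.2, 1.0),
    "output": (0.4, 0.1, 0.0),
    "cand": (0.6, 0.2, 0.0),
}

h, c = 0.0, 0.0
for x in [0.2, 0.5, 0.9]:        # a short made-up input sequence
    h, c = lstm_step(x, h, c, params)
```

When the forget gate is close to 1 and the input gate close to 0, the cell state is carried forward almost unchanged, which is the behaviour the Constant Error Carousel is meant to provide.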


4. RESULTS AND DISCUSSIONS:

Dataset information

This project uses the Ficus plant dataset, which is saved inside the 'dataset' folder. Some example records from the dataset are shown below.

CO2, Radiation, diameter, humidity, outside_temperature, inside_temperature, measurement, Yield
35.7, 20.85, 29.53, 0.91, 35.7, 27.48, 2.46, 35.7
35.1, 26.92, 29.77, 0.93, 35.1, 26.92, 2.83, 35.7
55.15, 25.42, 31.27, 0.67, 55.15, 31.8, 9.98, 45.6
54.87, 28.86, 32.39, 0.67, 54.87, 35.73, 9.97, 45.6
66.45, 34.7, 43.11, 0.75, 66.45, 39.12, 9.75, 13.1

The dataset has columns such as CO2, Radiation and diameter, and the last value in each record is the Yield of the crop under those environment values. These values are used to train the models, after which test data can be uploaded to predict future growth or yield. The test data contains the same environment values, but the Yield (growth) column is missing; applying the trained LSTM model to the test data predicts the future growth for each record.

Double-click the 'run.bat' file to launch the application.

Fig 1: Click the 'Upload Ficus Plant Dataset' button to upload the dataset.

Fig 2: The 'ficus.csv' dataset file is selected and uploaded.

Fig 3: The dataset is loaded and contains 4028 records in total. Click the 'Dataset Preprocess, Clean & Train Test Split' button to clean the dataset and split it into training and test parts.
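The cleaning and splitting steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the project's code: the helpers `fill_missing_with_mean` and `train_test_split` are written for this example, the rows are random stand-in values rather than the real 'ficus.csv' records, and rounding the test size up is an assumption chosen here because it reproduces the 3222/806 split that the application reports for 4028 records.

```python
import math
import random


def fill_missing_with_mean(rows):
    """Replace None entries in each column with that column's mean.

    A column with no observed values at all is filled with 0.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows and hold out `test_fraction` of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = math.ceil(test_fraction * len(shuffled))
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in data: 4028 rows of 8 made-up values, with one gap to clean.
rng = random.Random(0)
rows = [[rng.uniform(0.0, 70.0) for _ in range(8)] for _ in range(4028)]
rows[0][3] = None

clean = fill_missing_with_mean(rows)
train, test = train_test_split(clean)
print(len(train), len(test))  # 3222 806
```

Shuffling before splitting is also an assumption of this sketch; for time-ordered sensor data a chronological split may be more appropriate, and the source does not say which was used.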

Fig 4: The application splits the dataset 80%/20%, using 3222 records for training and 806 for testing. With the dataset loaded and split, click the 'Run SVR Algorithm' button to train the SVR algorithm.

Fig 5: The MSE, RMSE and MAE errors for the SVR algorithm are displayed. Click the 'Run Random Forest Algorithm' button to train the Random Forest algorithm.

Fig 6: The Random Forest MSE, RMSE and MAE errors are displayed. Click the 'Run LSTM Algorithm' button to train the LSTM algorithm on the dataset.

Fig 7: LSTM achieves lower MSE, RMSE and MAE errors than the traditional algorithms. With training complete for all algorithms, a test file can be uploaded to predict growth.

Fig 8: The 'test.txt' file is selected; click the 'Open' button to predict growth for the test data.

Fig 9: The predicted growth is 22% for the first record, 26% for the second and 40% for the third. New records can be added to the test data in the same way to predict their growth. Click the 'MAE Graph' button to see the MAE comparison graph for all algorithms.
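For reference, the three error measures used to compare the algorithms can be computed as in the sketch below. The function name `regression_errors` is chosen for this example, the true values reuse the Yield column of the sample records, and the predictions are made up; none of these numbers are results from the paper.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (MSE, RMSE, MAE) for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / len(residuals)
    rmse = math.sqrt(mse)
    mae = sum(abs(r) for r in residuals) / len(residuals)
    return mse, rmse, mae


# Yield values from the sample records; predictions are illustrative only.
y_true = [35.7, 35.7, 45.6, 45.6, 13.1]
y_pred = [34.7, 36.7, 47.6, 43.6, 13.1]
mse, rmse, mae = regression_errors(y_true, y_pred)
```

Because RMSE is the square root of MSE, the two always rank models in the same order. MAE weights all errors linearly, so it can rank models differently when a few predictions are far off.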

Fig 10: The x-axis shows the algorithm name and the y-axis shows the MAE error. The graph shows that LSTM has the lowest error, so its prediction performance is the best of the three algorithms.

Fig 11: MSE error graph.

Fig 12: RMSE graph.

Fig 13: The console shows model generation for the LSTM. The model is trained for 10 epochs; in each epoch the LSTM uses recent data to train the model and forgets old data references.

5. CONCLUSION:

The paper developed a DL approach using LSTM for Ficus growth (represented by the SDV) and tomato yield prediction, achieving high prediction accuracy in both problems. Experimental results were presented showing that the DL technique (using an LSTM model) outperformed traditional ML techniques, such as SVR and RF, in terms of the MSE, RMSE and MAE error criteria. The main aim of this project was to develop DL methodologies to predict plant growth and yield in greenhouse environments. Future work could continue in two directions: a) greatly increasing the amount of collected data used for training the proposed DL methods; b) extending the DL method to perform multi-step prediction (on a weekly or multi-week basis) of growth and yield in a large variety of greenhouses.

REFERENCES:

1. Abreu, P., Meneses, J. & Gary, C. 1998, "Tompousse, a model of yield prediction for tomato crops: calibration study for unheated plastic greenhouses", XXV International Horticultural Congress, Part 9: Computers and Automation, Electronic Information in Horticulture 519, pp. 141.
2. Adams, S. 2001, "Predicting the weekly fluctuations in glasshouse tomato yields", IV International Symposium on Models for Plant Growth and Control in Greenhouses: Modeling for the 21st Century-Agronomic and 593, pp. 19.
3. Atanasova, N., Todorovski, L., Džeroski, S. & Kompare, B. 2008, "Application of automated model discovery from data and expert knowledge to a real-world domain: Lake Glumsø", Ecological Modelling, vol. 212, no. 1-2, pp. 92-98.
4. Barandiaran, I. 1998, "The random subspace method for constructing decision forests", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 8.
