IoT-Enabled Distributed Data Processing For Precision Agriculture
Grigore Stamatescu, Cristian Drăgana, Iulia Stamatescu, Loretta Ichim and Dan Popescu
arXiv:1906.02678v1 [eess.SP] 3 Jun 2019

Abstract: Large scale monitoring systems, enabled by the emergence of networked embedded sensing devices, offer the opportunity of fine grained online spatio-temporal collection, communication and analysis of physical parameters. Various applications have been proposed and validated so far for environmental monitoring, security and industrial control systems. One particular application domain has been shown suitable for the requirements of precision agriculture, where such systems can improve yields, increase efficiency and reduce input usage. We present a data analysis and processing approach for distributed monitoring of crops and soil where hierarchical aggregation and modelling primitives contribute to the robustness of the network by alleviating communication bottlenecks and reducing the energy required for redundant data transmissions. The focus is on leveraging the fog computing paradigm to exploit local node computing resources and generate events towards upper decision systems. Key metrics are reported which highlight the improvements achieved. A case study is carried out on real field data for crop and soil monitoring with an outlook on operational and implementation constraints.

I. INTRODUCTION
Internet of Things (IoT) systems are based on distributed sensing, computing and communication devices that collaborate in order to monitor and control physical processes. These enable the collection of real world data at an unprecedented scale and resolution, which can then be used to improve the models that define the understanding and help the forecasting of the processes, be they technical, social or environmental. New data processing infrastructures are thus needed to store and retrieve the information collected in an online manner while providing mechanisms to run the analysis and control algorithms based on this data. Beyond conventional environmental monitoring as the initial key driver of IoT design, current domains include (smart) cities, industry and agriculture. Finally, the outcomes of the analysis are either handled in closed loops for control actions or they are supplied to hierarchical entities for decision support.
Among the application areas mentioned above, precision agriculture represents one of the salient areas where IoT-enabled systems can improve quality and productivity and increase automation [1]. The main challenges in this field relate to reducing input use (water, fertiliser, work) and obtaining better crop yields, which the market demands in order to keep food costs low under the strain of an increasing global population. By having access to reliable, on-line information, relayed over distributed networks, domain specialists can oversee tangible improvements [2].
The conceptual and practical challenges that we approach in the design of such systems are related to efficient data reduction and management, which directly impacts the congestion and energy metrics of the deployed network. This is performed by proposing a hierarchical data processing architecture in accordance with fog computing design principles. Fog computing as a concept initially emerged as a computing organisation alternative to leverage the intelligent network edge devices which make up modern IoT systems [3]. The limited computing resources available on these embedded devices are thus exploited to reduce the large quantities of collected data and transmit only higher level information pieces upstream. Given the large heterogeneity of the edge nodes, the processing primitives they can run range from basic threshold detection and averaging up to more advanced outlier detection and embedded learning algorithms. Wireless sensor networks (WSN) are an enabling technology to deploy fog computing systems [4], [5], where hundreds to thousands of sensing nodes self-organise and communicate over low power radio channels. As is the case with agriculture, large areas can thus be covered with multi-hop communication networks, as the networking protocols rely on cluster heads, gateways and hubs serving as intermediary data concentrators. One alternative definition presents fog systems in opposition or as complementary to conventional centralised and large scale cloud infrastructures. The complex functionality of the cloud platform is broken down at the field level over functional or spatially distributed entities which collaborate to achieve a common monitoring, event-detection and control case. In the precision agriculture use case this can help implement optimised distributed irrigation or fertiliser dosage schemes accounting for local properties and variance of soil, micro-climate and crop particularities. The need to integrate fog computing with cloud computing in this particular scenario lies in the fact that joint observations can be derived when federating high-level information across multiple farms.
The main novelty of the paper is justified by the application of fog computing data aggregation and modelling primitives in the context of IoT-enabled smart agriculture, currently a highly active area of research. The following contributions of the paper can be argued:
• system architecture for hierarchical data processing and analysis based on field level IoT devices;
• data aggregation methodology based on the fog computing paradigm under precision agriculture constraints.

*This work has been funded by the Romanian Space Agency (ROSA), through the project "Integrated Multi-Agent Aerial Robotic System for Exploring Terrestrial Regions of Interest" (MAARS), contract no. 185/2017.
The authors are with the Department of Automatic Control and Industrial Informatics, University Politehnica of Bucharest, 060042 Bucharest, Romania. Grigore Stamatescu is also with the Institute of Technical Informatics, Technical University of Graz, 8010 Graz, Austria [email protected]
II. RELATED WORK
In [6] a fog computing framework for precision agriculture is introduced. The two-tiered system is able to significantly reduce the data transmitted in the network. Reducing the computational loads and, most importantly, the cloud computing costs associated with centralised processing is highlighted as an essential benefit of the fog approach. The authors of [7] propose a hybrid IoT for smart farming in rural areas. The communication network uses 6LoWPAN local radio for the field interfaces while long range connections are implemented over WiFi. A 6LoWPAN border router and a dedicated gateway are used to assure cross-domain integration of the networks from field level, intermediate long range relays and cloud. Network requirements for smart agriculture applications are also discussed in terms of throughput, latency and mobility support. These offer a good reference to quantify the data aggregation potential in conjunction with the sensing and control requirements. A distributed computing architecture is presented in [8] which addresses the basic components of the agricultural system, such as crop, soil, climate, water and nutrients, and energy. The messaging system is standardised around the Message Queuing Telemetry Transport (MQTT) to interlink sensors, actuators, communication nodes, devices and subsystems [9]. A decision tree is designed for irrigation control and integrated on the edge devices for in situ decision making. At the top level, cloud services supply data through an end-user dashboard for high level decision support.
The authors of [10] introduce an intelligent irrigation system based on distributed sensors using LoRa long range, low rate nodes and gateways. The FIWARE infrastructure is leveraged as the data management middleware platform which provides the support services. Several operation scenarios are discussed based on the scalability requirements, in terms of tens of thousands of nodes. A reference computational resource assessment for CPU, memory and network is also reported.
Large scale IoT monitoring is discussed in [11]. The focus is on the ground level clustering mechanisms that support the timely collection of data and the generation of field level monitoring events. Aerial robotic platform support is provided through suitable high level control of trajectories for data collection and backhaul. Data reduction is achieved by thresholding over locally computed moving averages in conjunction with expert knowledge adapted to the monitored processes. Several radio access technologies are available to achieve reliable transmissions [12].

III. SYSTEM ARCHITECTURE AND METHODOLOGY
A. SYSTEM ARCHITECTURE FOR DATA COLLECTION AND PROCESSING
The proposed system architecture that we have designed for the purpose of efficient data collection and processing in precision agriculture is illustrated in Figure 1. It consists of the following information and physical layers: field layer, fog computing layer, cloud computing layer and data presentation layer, which are linked by cross-layer upstream and downstream data and control information flows. The layer functionality is detailed next:
• Field layer: includes the actual sensors deployed in the precision agriculture application to measure the physical parameters of interest; these include air temperature, air humidity, solar radiation, soil temperature at various depths, windspeed and rainfall; the field layer can also be expanded to accommodate intelligent actuators, e.g. for irrigation or fine grained nutrient dosage, to execute commands incoming from higher level systems;
• Fog computing layer: the fog nodes collect data from the sensors and run the data processing primitives for intelligent aggregation in order to reduce network traffic and energy expenditure; the main idea is to locally derive basic model characteristics of the particular process which are sent to the cloud in compact form; correlations between the sensed variables can also be exploited at this level for local decisions, thus completely avoiding the increased cost and latency of the upper layers;
• Cloud computing layer: data is streamed towards a common cloud platform; regarding the particular implementation, we use the ThingSpeak [13] platform in conjunction with Matlab algorithm development for higher level processing routines; at the cloud layer the model parameters allow the reconstruction of the time series characteristics if needed, while accounting for the inherent modelling errors;
• Data presentation layer: is concerned with the front-end software systems that present the outcomes of the data analysis to end-users or decision makers, with the ability to provide mobile access and timely alerts in the case of event detection; parametrisation of the process by domain experts is also achieved at this layer.

Fig. 1: Distributed data processing based on fog computing for precision agriculture

A more detailed algorithm flowchart is provided in Figure 2. It includes the steps of the algorithm which runs on the fog computing node.
In-field measurements are uploaded to the IoT application in two ways depending on the type of information: events and measurements.
Primary batch aggregation: Note that a primary batching procedure is usually available for most monitoring systems, basically consisting of computing the minimum, maximum and mean value over a specific period of time. We consider this as the starting point for further local data processing. For instance, batches are defined within 30 minutes. Once a new batch is available, min, max and mean values are computed (step A).
Check for outliers procedure: For each batch of measurements, an outlier check procedure is performed, considering an acceptance bandwidth of data variance for the measured value around the mean (step B). The procedure outputs an event if the minimum or maximum value exceeds the thresholds. The event E is defined as:

$$E = \{ e(x_i) \in Q,\; T_{min} < x_i < T_{max} \} \tag{1}$$

where:
• $x_i$ is the measured value at iteration $i$
• $T_{min}$ and $T_{max}$ are thresholds computed as:

$$T_{min} = \mathrm{mean}\,(1 - w) \tag{2}$$

$$T_{max} = \mathrm{mean}\,(1 + w) \tag{3}$$

where $w$ is a weight that defines the size of the acceptance bandwidth.
Relevant data extraction: Aggregated data sets are obtained using different methods. All of them seek relevant data points, aiming at a reduced size set which at the same time provides a satisfactory reconstruction of the initial data.
One effective method, in terms of data volume, is based on extracting the min and max values computed over 24 hours. This method is suitable only for measurements that follow a regular shape over time, with insignificant variations during a day. A measurement for which this method is suitable is the soil temperature.
Change detection, instead, is a common method applicable to irregularly shaped data sets. This method extracts the data points where trend changes occur.
Given a set of data points $(x_i, y_i)$, $i = 1, ..., n$, the trend $t(i)$ is tracked for each pair $x_i, x_{i+1}$, such that:

$$\begin{aligned} x_{i+1} - x_i > \delta &\implies t(i) = 1 \\ x_{i+1} - x_i < \delta &\implies t(i) = -1 \\ x_{i+1} = x_i &\implies t(i) = 0 \end{aligned} \tag{4}$$

Then $t(i) \neq t(i+1)$ means that a trend change is detected. The corresponding data point $x(i+1)$ is added to the relevant data set.
Relevant data extraction (step C) is performed when a set of primary aggregated batches is available.

B. DATA AGGREGATION
One reference method of extracting high level information from sensor data is Symbolic Aggregate Approximation (SAX) [14]. It operates by assigning label symbols to segments of the time series, thus mapping it onto a unified lower dimension representation. It belongs to the family of time series data mining techniques leading to non-parametric modelling. Ranges are identified through the data histogram or in a uniform manner. The method provides linear complexity and opens up the use and application of multiple statistical learning tools. Parametrisation of SAX is highly important: the number of segments and the alphabet size can influence the quality and robustness of the result.
The background on which SAX has been defined is established by Piecewise Aggregate Approximation (PAA) [15], where symbols are attributed to the aggregated numerical values listed by PAA. Several discrete event models, e.g. Markov models, can incorporate the resulting aggregated segments in order to compute the probability of the observed patterns for future observations.
According to the PAA method description, a time series $X$ of length $n$ is approximated by a vector $\bar{X} = (\bar{x}_1, ..., \bar{x}_M)$ of any length $M \leq n$, with $n$ divisible by $M$. Each element $\bar{x}_i$ of the vector is calculated by:

$$\bar{x}_i = \frac{M}{n} \sum_{j=\frac{n}{M}(i-1)+1}^{\frac{n}{M} i} x_j \tag{5}$$
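As an illustration of the fog-node primitives described in this section, the following minimal Python sketch covers steps A to C and the PAA of (5). It is not the authors' implementation. The function names, the default values of `w` and `delta`, the alphabet and the uniform-range symbol assignment are hypothetical choices made for the example. Two further assumptions are made: an event is raised when the batch minimum or maximum falls outside the band of (2) and (3), following the prose rather than the set notation of (1), and the trend sign uses a symmetric dead band of width `delta`, which is one interpretation of (4).

```python
from statistics import mean


def batch_stats(batch):
    """Step A: minimum, maximum and mean of one batch of samples."""
    return min(batch), max(batch), mean(batch)


def outlier_event(batch, w=0.1):
    """Step B: True if the batch min or max leaves the band mean * (1 -/+ w).

    Assumption: w = 0.1 is an arbitrary default, not a value from the paper.
    """
    lo, hi, avg = batch_stats(batch)
    # sorted() keeps t_min <= t_max even when the batch mean is negative.
    t_min, t_max = sorted((avg * (1 - w), avg * (1 + w)))
    return lo < t_min or hi > t_max


def trend_change_points(values, delta=0.0):
    """Step C: indices of the samples at which the trend sign changes.

    Assumption: differences within [-delta, +delta] count as "no trend" (0).
    """
    def trend(diff):
        if diff > delta:
            return 1
        if diff < -delta:
            return -1
        return 0

    trends = [trend(b - a) for a, b in zip(values, values[1:])]
    # t(i) != t(i + 1) marks a change; the sample at index i + 1 is kept.
    return [i + 1 for i in range(len(trends) - 1) if trends[i] != trends[i + 1]]


def paa(series, m):
    """PAA as in (5): mean of each of m equally sized frames."""
    n = len(series)
    if m <= 0 or n % m != 0:
        raise ValueError("series length n must be divisible by M")
    size = n // m
    return [sum(series[k * size:(k + 1) * size]) / size for k in range(m)]


def sax_uniform(paa_values, alphabet="abcdef"):
    """Label PAA values by a uniform partition of their observed range.

    Assumption: uniform ranges and a six-letter alphabet; histogram-based
    ranges would be an alternative.
    """
    lo, hi = min(paa_values), max(paa_values)
    width = (hi - lo) / len(alphabet) or 1.0  # avoid zero width on flat data
    last = len(alphabet) - 1
    return "".join(alphabet[min(int((v - lo) / width), last)] for v in paa_values)
```

For instance, `paa([1, 2, 3, 4, 5, 6], 3)` averages three frames of two samples each, and `trend_change_points([1, 2, 3, 2, 1, 1])` keeps the peak and the start of the plateau. Only the retained points or symbols would then need to be sent upstream.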
Fig. 2: Fog Computing algorithm (flowchart blocks: Idle, time triggered; New batch available?; A: Perform min, max, mean values; B: Check for outliers; Outlier/s found?; Store data, counter++; counter == limit?; C: Extract relevant data from each batch; Send to IoT platform; counter = 0)

The dimensionality of the time series is thus reduced from $n$ to $M$ samples by initially dividing the original data into $M$ equally sized frames and then computing the mean value of each frame. A new sequence is obtained by putting the mean values together, which is considered to be the PAA transform (approximation) of the original data. With regard to computational considerations, the PAA transform complexity can be reduced from $O(NM)$ to $O(Mm)$, with $m$ being the number of frames as a tuning parameter of the method. The distance measure between two time series vector approximations $\bar{X}$ and $\bar{Y}$ is defined as:

$$D_{PAA}(\bar{X}, \bar{Y}) = \sqrt{\frac{n}{M}} \sqrt{\sum_{i=1}^{M} (\bar{x}_i - \bar{y}_i)^2} \tag{6}$$

It has been shown by the proposers of the method that PAA satisfies the lower bounding condition and guarantees no false dismissals, such that:

$$D_{PAA}(\bar{X}, \bar{Y}) \leq D(X, Y) \tag{7}$$

C. INTERPOLANT METHODS
The Cloud-based application rebuilds data sets by estimates based on interpolation mechanisms. For performance evaluation we showcase three methods: the common linear interpolant (also referred to as the piecewise linear interpolant) and two closely related interpolants, cubic spline and shape preserving Piecewise Cubic Hermite Interpolating Polynomial (pchip).
Given a set of data points $(x_i, y_i), (x_{i+1}, y_{i+1}), ..., (x_n, y_n)$, the linear interpolation is defined as the concatenation of linear interpolants between each pair of data points, that is, a set of straight line segments between consecutive data points. Any pair of data points with $x_i \neq x_{i+1}$ determines a unique polynomial $p$ of degree less than two whose graph passes through the two points, with the property:

$$p(x_i) = y_i \tag{8}$$

with the form:

$$p(x) = a_1 x + a_0 \tag{9}$$

which is a 1-D linear interpolation.
In general, given $n$ points $(x_i, y_i)$, $i = 1, ..., n$, with distinct $x_i$, a polynomial of degree less than $n$ whose graph passes through the $n$ points, denoted $P_n(x)$, is expressed in the Lagrange form as:

$$P_n(x) = \sum_{i=1}^{n} \left( \prod_{\substack{j=1 \\ j \neq i}}^{n} \frac{x - x_j}{x_i - x_j} \right) y_i \tag{10}$$

The Lagrange form in (10) can be written out in the power form of an interpolating polynomial as:

$$P_n(x) = a_1 x^{n-1} + a_2 x^{n-2} + ... + a_{n-1} x + a_n \tag{11}$$

where the coefficients $a_k$ are computed through a system of linear equations:

$$\begin{pmatrix} x_1^{n-1} & x_1^{n-2} & \cdots & x_1 & 1 \\ x_2^{n-1} & x_2^{n-2} & \cdots & x_2 & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_n^{n-1} & x_n^{n-2} & \cdots & x_n & 1 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \tag{12}$$

Considering this, a piecewise linear interpolant is produced by first computing the divided difference:

$$\delta_i := \frac{y_{i+1} - y_i}{x_{i+1} - x_i} \tag{13}$$

Then the interpolant is constructed as:

$$P(x) = y_i + \delta_i (x - x_i) \tag{14}$$

Further, for piecewise cubic polynomials, considering an interval $x_i \leq x \leq x_{i+1}$, let $h_i := x_{i+1} - x_i$ be the length of the $i$th interval and $d_i := P'(x_i)$. Using this derivative it is possible to adjust the interpolant in order to enforce smoothness, by forcing the pair of derivatives from consecutive piecewise cubics to agree.
All piecewise cubic Hermite interpolating polynomials are continuous and have a continuous first derivative. The spline is smoother still, meaning that its second derivative also varies continuously.
Instead, pchip is not as smooth as spline; it is designed so that it never overshoots the data. The slopes are chosen so that $P(x)$ preserves the shape of the data and also respects monotonicity.
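To make the difference between a local piecewise interpolant and a single global polynomial concrete, the following minimal Python sketch implements the piecewise linear interpolant of (13) and (14) and the Lagrange form of (10). It is an illustrative sketch only: the function names and the clamping of out-of-range queries to the end intervals are assumptions, and the cubic spline and pchip interpolants used in the evaluation are not reproduced here.

```python
from bisect import bisect_right


def piecewise_linear(xs, ys):
    """Return P(x) built from the divided differences of (13) and (14)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs")
    if any(b <= a for a, b in zip(xs, xs[1:])):
        raise ValueError("xs must be strictly increasing")
    # delta_i = (y_{i+1} - y_i) / (x_{i+1} - x_i), one slope per interval
    deltas = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

    def p(x):
        # Find the interval [x_i, x_{i+1}] holding x; queries outside the
        # data range reuse the first or last interval (an assumption).
        i = min(max(bisect_right(xs, x) - 1, 0), len(deltas) - 1)
        return ys[i] + deltas[i] * (x - xs[i])

    return p


def lagrange(xs, ys):
    """Return P_n(x) in the Lagrange form of (10); xs must be distinct."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return p


if __name__ == "__main__":
    # Three retained samples, e.g. trend-change points sent by a fog node.
    xs, ys = [0.0, 2.0, 4.0], [1.0, 5.0, 1.0]
    linear, poly = piecewise_linear(xs, ys), lagrange(xs, ys)
    for x in (1.0, 2.0, 3.0):
        print(x, linear(x), poly(x))
```

Both interpolants pass through the retained samples, but between them the global polynomial bends away from the straight segments, which is the kind of behaviour the shape preserving pchip method is designed to limit. For the cubic variants, SciPy provides `scipy.interpolate.CubicSpline` and `scipy.interpolate.PchipInterpolator`, which could be substituted into such a comparison.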
IV. EXPERIMENTAL RESULTS
We collect experimental data from a network of field devices installed on site at an experimental research farm. [...] for analysis that covers one month of data. The data is [...]

Fig. 3: Symbolic Aggregation Approximation - Soil Temperature (x-axis: Sample)
[Figure: SAX Representation of IoT Agriculture Data - Solar Radiation]

• R-square - [...]ing the variation of the data and is expressed as:

$$R\text{-}square = 1 - \frac{SSE}{SST} \tag{16}$$

where

$$SST = \sum_{i=1}^{n} w_i (x_i - \bar{x})^2 \tag{17}$$

where $\bar{x}$ is the mean value of the $x_i$ dataset.
• Root mean square error (RMSE) - is an estimate of the standard deviation of the random component in the data and is expressed as:

$$RMSE = \sqrt{\frac{SSE}{n}} \tag{18}$$

Results are summarised in Table I.
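The two metrics can be computed with a few lines of Python, sketched below. The definition of SSE is not present in this part of the text, so the sketch assumes the usual weighted sum of squared errors between the raw samples and the reconstructed ones, with the same weights $w_i$ as in (17) and unit weights by default. The function name `fit_metrics` and its signature are hypothetical.

```python
from math import sqrt


def fit_metrics(raw, reconstructed, weights=None):
    """Return (R-square, RMSE) as in (16) and (18).

    Assumption: SSE = sum_i w_i * (x_i - xhat_i)^2, with unit weights
    unless given; xhat_i is the reconstructed value for raw sample x_i.
    """
    n = len(raw)
    if n == 0 or n != len(reconstructed):
        raise ValueError("raw and reconstructed must be non-empty and equally long")
    w = list(weights) if weights is not None else [1.0] * n
    if len(w) != n:
        raise ValueError("weights must match the number of samples")

    x_bar = sum(raw) / n  # mean of the raw data, as used in (17)
    sse = sum(wi * (x - xh) ** 2 for wi, x, xh in zip(w, raw, reconstructed))
    sst = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, raw))
    if sst == 0:
        raise ValueError("R-square is undefined for constant raw data")

    return 1.0 - sse / sst, sqrt(sse / n)


if __name__ == "__main__":
    raw = [1.0, 2.0, 3.0, 4.0]
    reconstructed = [1.0, 2.0, 3.0, 5.0]  # one sample off by 1
    r_square, rmse = fit_metrics(raw, reconstructed)
    print(r_square, rmse)
```

An R-square close to 1 and a small RMSE both indicate that the cloud-side reconstruction stays close to the raw fog input. RMSE is expressed in the unit of the measured quantity, which makes it easier to compare against sensor accuracy.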
[Figure: soil temperature [°C]]

Fig. 7: Solar Radiation (y-axis: Solar radiation [W/m2]; x-axis: samples [1/30 minutes]; series: Fog input - raw data, Fog output - pchip method, Fog output - spline method, Fog output - interp1q method; results shown in the plot: RMSE pchip: 0.2627, RMSE interp1q: 0.2551, RMSE spline: 0.3691)

Fig. 8: Solar Radiation - Histogram (panel title: Input/Output Solar Radiation Distribution; x-axis: Solar Radiation (normalized); y-axis: Count; series: Input, Output)

Figure 6 and Figure 7 graphically depict the results of applying the alternative methods of interpolation on the two time series. In Figures 8 and 9 the histograms quantify the associated data reduction between the raw input data and the interpolant methods presented.
For this case, the monotonicity property of pchip is more desirable than the smoothness property of spline, which in some places overshoots the data; thus one may prefer the good behaviour of the shape preserving pchip method. Note that, as with the linear interpolation, when there are two consecutive points with the same value, the interpolant is constant over that interval. This behaviour was expected and it is appropriate in this context.
Even if the metrics indicate a better fit for linear interpolation in the studied cases, one can choose the pchip method, given that the results are quite close and it gives a visually more pleasing representation, in particular better modelling the peaks and following the expected behaviour around the baseline.

[Figure: Input/Output Soil Temperature Distribution (y-axis: Count; series: Input, Output)]

V. CONCLUSIONS
The paper presented a system architecture and distributed data processing application based on IoT in precision agriculture. By exploiting the dense spatial and temporal distributions of the sensing nodes, intelligent data reduction through aggregation and model reconstruction is illustrated, with significant benefits for network congestion and energy efficiency. As the results achieved show promise, future work is focused on extensive evaluation for online decision making by domain experts in order to improve the reconstructed data quality.

REFERENCES
[1] S. Heble, A. Kumar, K. V. V. D. Prasad, S. Samirana, P. Rajalakshmi, and U. B. Desai, “A low power iot network for smart agriculture,” in 2018 IEEE 4th World Forum on Internet of Things (WF-IoT), Feb 2018, pp. 609–614.
[2] O. Elijah, T. A. Rahman, I. Orikumhi, C. Y. Leow, and M. N. Hindia, “An overview of internet of things (iot) and data analytics in agriculture: Benefits and challenges,” IEEE Internet of Things Journal, vol. 5, no. 5, pp. 3758–3773, Oct 2018.
[3] M. R. Anawar, S. Wang, M. Azam Zia, A. K. Jadoon, U. Akram, and S. Raza, “Fog computing: An overview of big iot data analytics,” Wireless Communications and Mobile Computing, vol. 2018, 2018.
[4] V. Mihai, C. Dragana, G. Stamatescu, D. Popescu, and L. Ichim, “Wireless sensor network architecture based on fog computing,” in 2018 5th International Conference on Control, Decision and Information Technologies (CoDIT), April 2018, pp. 743–747.
[5] V. Mihai, C. E. Hanganu, G. Stamatescu, and D. Popescu, “Wsn and fog computing integration for intelligent data processing,” in 2018 10th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), June 2018, pp. 1–4.
[6] E. Guardo, A. Di Stefano, A. La Corte, M. Sapienza, and M. Scatà, “A fog computing-based iot framework for precision agriculture,” Journal of Internet Technology, vol. 19, no. 5, pp. 1401–1411, 2018.
[7] N. Ahmed, D. De, and I. Hussain, “Internet of things (iot) for smart precision agriculture and farming in rural areas,” IEEE Internet of Things Journal, vol. 5, no. 6, pp. 4890–4899, Dec 2018.
[8] F. J. Ferrández-Pastor, J. M. García-Chamizo, M. Nieto-Hidalgo, and J. Mora-Martínez, “Precision agriculture design method using a distributed computing architecture on internet of things context,” Sensors, vol. 18, no. 6, p. 1731, 2018.
[9] T. Yokotani and Y. Sasaki, “Transfer protocols of tiny data blocks in iot and their performance evaluation,” in 2016 IEEE 3rd World Forum on Internet of Things (WF-IoT), Dec 2016, pp. 54–57.
[10] C. Kamienski, J.-P. Soininen, M. Taumberger, R. Dantas, A. Toscano, T. Salmon Cinotti, R. Filev Maia, and A. Torre Neto, “Smart water management platform: Iot-based precision irrigation for agriculture,” Sensors, vol. 19, no. 2, p. 276, 2019.
[11] D. Popescu, C. Dragana, F. Stoican, L. Ichim, and G. Stamatescu, “A collaborative uav-wsn network for monitoring large areas,” Sensors, vol. 18, no. 12, p. 4202, 2018.
[12] G. Stamatescu, D. Popescu, and R. Dobrescu, “Cognitive radio as solution for ground-aerial surveillance through wsn and uav infrastructure,” in Proceedings of the 2014 6th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), Oct 2014, pp. 51–56.
[13] M. Maureira, D. Oldenhof, and L. Teernstra, “Thingspeak - an api and web service for the internet of things.” 2011, available online.
[14] E. Keogh, J. Lin, and A. Fu, “Hot sax: Efficiently finding the most unusual time series subsequence,” in Proceedings of the Fifth