0% found this document useful (0 votes)

6 views19 pages

Electronics 12 01007

This research focuses on forecasting meteorological variables using various machine learning techniques implemented in Python, including random forest and XGBoost, which showed superior performance for most variables. The study evaluates models for temperature, relative humidity, solar radiation, and wind speed, utilizing metrics such as RMSE and R2 for assessment. The findings aim to support applications in agriculture and renewable energy management by providing a low-cost forecasting system based on open-source software.

Uploaded by

Francisco Carrillo

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

6 views19 pages

Electronics 12 01007

Uploaded by

Francisco Carrillo

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 19

electronics

Article
Meteorological Variables Forecasting System Using Machine
Learning and Open-Source Software
Jenny Aracely Segovia *, Jonathan Fernando Toaquiza *, Jacqueline Rosario Llanos *
and David Raimundo Rivas

Department of Electrical and Electronic Engineering, Universidad de las Fuerzas Armadas (ESPE),
Sangolquí 171103, Ecuador
* Correspondence: [email protected] (J.A.S.); [email protected] (J.F.T.); [email protected] (J.R.L.)

Abstract: The techniques for forecasting meteorological variables are highly studied since prior
knowledge of them allows for the efficient management of renewable energies, and also for other
applications of science such as agriculture, health, engineering, energy, etc. In this research, the
design, implementation, and comparison of forecasting models for meteorological variables have
been performed using different Machine Learning techniques as part of Python open-source software.
The techniques implemented include multiple linear regression, polynomial regression, random
forest, decision tree, XGBoost, and multilayer perceptron neural network (MLP). To identify the
best technique, the mean square error (RMSE), mean absolute percentage error (MAPE), mean
absolute error (MAE), and coefficient of determination (R2 ) are used as evaluation metrics. The most
efficient techniques depend on the variable to be forecasting, however, it is noted that for most of
them, random forest and XGBoost techniques present better performance. For temperature, the best
performing technique was Random Forest with an R2 of 0.8631, MAE of 0.4728 ◦ C, MAPE of 2.73%,
and RMSE of 0.6621 ◦ C; for relative humidity, was Random Forest with an R2 of 0.8583, MAE of
2.1380RH, MAPE of 2.50% and RMSE of 2.9003 RH; for solar radiation, was Random Forest with
an R2 of 0.7333, MAE of 65.8105 W/m2 , and RMSE of 105.9141 W/m2 ; and for wind speed, was
Random Forest with an R2 of 0.3660, MAE of 0.1097 m/s, and RMSE of 0.2136 m/s.

Keywords: machine learning; forecasting models; meteorological variables; Python

Citation: Segovia, J.A.; Toaquiza, J.F.;
Llanos, J.R.; Rivas, D.R.
Meteorological Variables Forecasting
System Using Machine Learning and
1. Introduction
Open-Source Software. Electronics
2023, 12, 1007. https://doi.org/ Since the 17th century, meteorological variables have been of great interest throughout
10.3390/electronics12041007 history, with the creation of the first instruments for measuring meteorological variables
aiming to accurately predict the weather. For this purpose, mathematical and statistical
Academic Editor: Grzegorz Dudek
methods and computer programs are used, most of which are of a non-linear nature [1].
Received: 4 January 2023 Nowadays, climatic conditions change under various influences. For example, atmospheric
Revised: 4 February 2023 pollution is increasing, so climate change is occurring and threatening the planet [2], which
Accepted: 6 February 2023 is why the measurement of meteorological variables has grown in importance as the
Published: 17 February 2023 information provided by the meteorological stations is important for monitoring climate
change [3].
Climate is defined by the grouping of meteorological phenomena that are related to
each other; although each of them is studied separately, it must be taken into account that a
Copyright: © 2023 by the authors.
change in one produces a variation in the others [4]. The actual weather is characterised
Licensee MDPI, Basel, Switzerland.
by the wind, temperature, and humidity variables forced by radiative fluxes and surface
This article is an open access article
latent and sensible heat fluxes. The local climate usually denotes the mean state of the
distributed under the terms and
atmosphere over a 20–30-year period for a given location and day (or season) of the year.
conditions of the Creative Commons
Attribution (CC BY) license (https://
For this reason, meteorological variables are usually modeled by means of computational,
creativecommons.org/licenses/by/
numerical, and statistical techniques, most of which are nonlinear [5]. Forecasting certain
4.0/).
climatic variables is a great challenge due to the variable behavior of the climate, which

Electronics 2023, 12, 1007. https://doi.org/10.3390/electronics12041007 https://www.mdpi.com/journal/electronics

Electronics 2023, 12, 1007 2 of 19

makes it impossible to optimally manage renewable energies and obtain a greater benefit
from them.
There are multiple scientific studies of modeling and prediction in order to forecast
future conditions of phenomena in various fields; among the most prominent are ARIMA,
Chaos Theory, and Neural Networks [6]. Forecasting models have evolved in recent
decades, from smart systems with formal rules and logical theories, to the emergence of
artificial intelligence techniques that allow us to propose alternatives in the treatment of
information [7].
Currently, forecasting models have a high impact and are used for several applica-
tions, such as management of energy units for renewable resources microgrids [8,9], load
estimation methods for isolated communities that do not receive energy or only receive it
for a limited time each day [10,11], the operation of energy systems [12,13], in agriculture
to predict the water consumption of plants and plan the irrigation sheet [14], in agriculture
4.0 for the prediction of variables that affect the quality of crops, for micronutrient analysis
and prediction of soil chemical parameters [15], optimization of agricultural procedures
and increasing productivity in the field, forecasting of SPI and Meteorological Drought
Based on the Artificial Neural Network and M5P Model Tree [16], and in controllers based
on forecasting models and predictive controllers. They are also used in the health field to
predict the solar radiation index and to obtain a correct assessment in people with skin
cancer [17], therefore, all the applications mentioned above need forecasting models that
have the lowest error rate for their effective operation.
Having a forecasting model system is costly because computer packages are used
in which licensing costs can be significant. On the other hand, free software is an option
to reduce costs. This research proposes a system based on free software (Python), which
is currently used at industrial level for its reliability, for example in applications such as
the following: Advanced Time Series: Application of Neural Networks for Time Series
Forecasting [18], Machine Learning in Python: main developments and technological
trends in data science, Machine Learning and artificial intelligence [19], Development of
an smart tool focused on artificial vision and neural networks for weed recognition in rice
plantations, using Python programming language [20], etc.
In this research, different prediction techniques were evaluated and compared—among
them, multiple linear regression, polynomial regression, random forest, decision tree, XG-
Boost, and multilayer perceptron neural network—in order to identify the best performing
strategy, using evaluation metrics such as the root mean square error (RMSE) and the
coefficient of determination (R2 ). The variables to be predicted are temperature, relative
humidity, solar radiation, and wind speed, from data taken from the weather station located
in Ecuador, Tungurahua province, Baños. The predicted variables will be the inputs for a
smart irrigation system and used for an energy management system of a microgrid based
on predictive control, therefore, models with high approximation to online measurements
are required.
The contributions of this work are as follows: (i) To design, validate, and compare
different machine learning techniques, and with them select the best technique that adapts
to climate variables for agriculture and energy applications, (ii) To develop a forecast system
for climate variables of low cost based in free software (Python), (iii) To generate forecasting
models that can be replicated for other types of variables applied to smart control systems
based on forecasting models.

2. Design of Forecasting Models for Meteorological Variables

This section describes the prediction techniques used and their design. In this research,
the following meteorological variables are studied and predicted: temperature, relative
humidity, wind speed, and solar radiation.
The techniques designed, evaluated, and compared are the following: multiple linear
regression, polynomial regression, random forest, decision tree, XGBoost, and neural
Electronics2023,
Electronics 2023,12,
12,1007
x FOR PEER REVIEW 33 of
of1920

network—multilayerperceptron.
network—multilayer perceptron. To
Toobtain
obtainthetheforecast
forecastofofmeteorological
meteorologicalvariables,
variables,the
the
designmethodology
design methodologyshown
shownininFigure
Figure11isisimplemented.
implemented.

Figure1.1. Flowchart
Figure Flowchart of
ofthe
themethodology
methodologyused to to
used obtain forecasting
obtain models
forecasting for meteorological
models vari-
for meteorological
ables.
variables.

2.1.Obtaining
2.1. Obtainingthe theDatabase
Database
Forthe
For theimplementation
implementationofofthe theforecasting
forecastingmodels,
models,information
informationwas wasobtained
obtainedfrom
from
thepage
the pageofof the
the Tungurahua
Tungurahua hydrometeorological network, network, where
wherethere thereareareseveral
severalmete-
me-
orological stations,
teorological stations,including
including the
the Baños
Baños family park, located in in Ecuador,
Ecuador,Tungurahua
Tungurahua
province,Baños,
province, coordinatesX𝑋==9,9,845,439,
Baños,coordinates 845, 439, Y 𝑌== 791,471 that counts
791, 471 that counts the
theparameters
parametersofof
precipitation
precipitation(mm (mm), temperature(◦(°C),
), temperature C ) , relative
relative humidity
humidity ((%),
% ) , wind
wind speed
speed (m/s),
(m/s ), wind
wind
◦
direction( (°), (W/m, ),
2
(mm).

direction ), solar radiation
solar radiationW/m and evapotranspiration
and evapotranspiration (mm ). ForFor
the design of theof
the design
models, only only
the models, the values of temperature,
the values of temperature,solar radiation, relativerelative
solar radiation, humidity, and wind
humidity, andspeed
wind
were
speed taken,
weresince after
taken, a previous
since analysis of
after a previous correlation
analysis between meteorological
of correlation variables,
between meteorological
the variables
variables, thewith lower correlation
variables with lowerwith the variable
correlation withtothe
be predicted
variable to arebediscarded.
predictedItare
is
Electronics 2023, 12, 1007 4 of 19

important to note that the values of temperature, solar radiation (net solar radiation at
surface), and relative humidity were measured at a distance of 2 m, while the wind speed
was measured at 10 m.

2.2. Data Preprocessing

From the database obtained, 1 year of information was available (from 23 July 2021
to 15 June 2022), which was preprocessed to take data every 5 min for each variable
(temperature, relative humidity, wind speed, and solar radiation). To make a forecast, it is
important to verify that there are no missing data in the measurements or to implement a
data filling method; in this case, a Python algorithm was implemented, which calculates
the average of the existing list of data and automatically fills in the missing data.

2.3. Dataset Division

To verify that the models work correctly, the available database is divided into
three groups: training set, test set, and validation set. As its name indicates, the first
one will be used to train the forecasting models, the second one will be used to evaluate
the test set, and the third one to validate each of the implemented models [17,21].
After data preprocessing, a total of 93,780 data were obtained for each variable, where
80% of the database (75,024 data) is used to train the models, 20% (18,756 data) to test the
models, and 2 days (576 data) were used for the validation of the models.

2.4. Design of the Forecasting Models

2.4.1. Multiple Linear Regression
It is a technique that allows modeling the relationship between a continuous variable
and one or more independent variables by adjusting a linear equation. It is called simple
linear regression when there is one independent variable, and if there is more than one,
it is called multiple linear regression. In this context, the modeled variables are called
dependent or response variables (y); and the independent variables are called regressors,
predictors, or features ( X ) [22]. Multiple linear regression is defined by Equation (1)

y = a + b1 X1 + b2 X2 + · · · + bn Xn (1)

where: X1 , X2 , . . . Xn : are the predictor or independent variables, b1 , b2 , . . . bn : coefficients

of the predictor variables, a : constant of the relationship between the dependent and
independent variable, and y : predicted or dependent variable.
After performing different heuristic tests and using sensitivity analysis for this fore-
casting technique, it is deduced that the best parameters for tuning are those described
in Table 1.

Table 1. Tuning parameters for the multiple linear regression techniques.

Multiple Linear Regression

Predicted Variable Inputs Variables
Temperature Solar radiation, relative humidity, wind speed
Solar radiation Temperature, relative humidity, wind speed
Wind speed Temperature, solar radiation, relative humidity
Relative Humidity Temperature, solar radiation, wind speed

2.4.2. Polynomial Regression

A linear regression with polynomial attributes that uses the relationship between the
dependent (y) and independent ( X ) variables to find the best way to draw a line through
the data points. This technique is used when the data are more complex than a simple
straight line [23], and is defined by Equation (2).

y = a + b1 Xi + b2 Xi 2 + b3 Xi 3 + . . . + bn Xi n (2)
Electronics 2023, 12, 1007 5 of 19

where: X1 , X2 , . . . Xn : are the predictor or independent variables, b1 , b2 , . . . bn : coefficients

Table 2. Tuning parameters for polynomial regression technique.

Polynomial Regression
Degree of the
Predicted Variable Inputs Variables
Polynomial
Temperature Solar radiation, relative humidity, wind speed 4
Solar radiation Temperature, relative humidity, wind speed 5
Wind speed Temperature, solar radiation, relative humidity 6
Relative Humidity Temperature, solar radiation, wind speed 4

2.4.3. Decision Tree

Values by learning decision rules derived from features and can be used for classifica-
tion, regression, and multi-output tasks. Decision trees work by dividing the feature space
into several simple rectangular regions, divided by parallel divisions of axes. To obtain
a prediction, the mean or mode of the responses of the training observations, within the
partition to which the new observation belongs, is used [23]. This is defined by Equation (3).
m
Gi = 1 − ∑ ( Pi,k )2 (3)
k =1

where: Pi,k : is the radio of class k instances among the training instances in the ith node,
m : number of class labels, and Gi (Gini impurity): represents the measure for constructing
decision trees.
After performing different heuristic tests and using sensitivity analysis for this forecast
technique, it is deduced that the best parameters for tuning are those described in Table 3.

Table 3. Tuning parameters for the decision tree technique.

Decision Tree
Predicted Variable Inputs Variables Max_Depth Min_Samples_Leaf
Temperature Solar radiation, relative humidity, wind speed 10 18
Solar radiation Temperature, relative humidity, wind speed 10 7
Wind speed Temperature, solar radiation, relative humidity 19 6
Relative Humidity Temperature, solar radiation, wind speed 9 16

2.4.4. Random Forest

A supervised learning algorithm that uses an ensemble learning method for regression
that combines predictions from several machine learning algorithms (decision trees) to
make a more accurate prediction than a single model [23]. Figure 2 shows that the random
forest algorithm is composed of a collection of decision trees, and each tree in the set is
composed of a sample of data extracted from a training set (DATASET); for a regression task,
the individual decision trees are averaged (Average) until the predicted value (Prediction)
is obtained.
In general, deep decision trees tend to overfit, while random forests avoid this by
generating random subsets of features and using those subsets to build smaller trees.
The generalization error for random forests is based on the strength of the individual
constructed trees and their correlation [24].
Electronics 2023,12,
Electronics2023, 12,1007
x FOR PEER REVIEW 66 of
of 19
20

Figure2.2.Algorithm
Figure Algorithmfor
formaking
makingpredictions
predictionsusing
usingrandom
randomforest.
forest.

In general,
This technique deep
hasdecision trees tend to
several parameters overfit,
that can bewhile random
configured, forests
such as theavoid this by
following:
N◦ estimators:
generating the number
random subsets of trees
of features andinusing
the forest. Max leaf
those subsets nodes:smaller
to build the maximum
trees. The
number of leaf nodes,
generalization thisrandom
error for hyperparameter
forests issets a condition
based for splitting
on the strength the individual
of the tree nodes andcon-
thus restricts
structed treesthe growth
and their of the tree. If[24].
correlation after splitting there are more terminal nodes than the
specified
Thisnumber,
technique thehas
splitting
several stops and the tree
parameters thatdoes notconfigured,
can be continue tosuchgrow,aswhich helps to
the following:
avoidN°overfitting.
estimators: AndtheMax features:
number the in
of trees maximum
the forest.number
Max leafof features
nodes: the that are evaluated
maximum num-
for
bersplitting at eachthis
of leaf nodes, node, increasing max_features
hyperparameter generally
sets a condition improves
for splitting themodel performance,
tree nodes and thus
since eachthe
restricts node now has
growth a greater
of the tree. Ifnumber of options
after splitting to are
there consider
more [23].
terminal nodes than the
After number,
specified performing thedifferent
splittingheuristic
stops and tests
theand
treeusing
doessensitivity
not continueanalysis for this
to grow, whichforecast
helps
technique, it is deduced that the best parameters for tuning are those
to avoid overfitting. And Max features: the maximum number of features that are evalu- described in Table 4.
ated for splitting at each node, increasing max_features generally improves model perfor-
Table 4. Tuning
mance, since eachparameters
node now for the
hasrandom forest
a greater technique.
number of options to consider [23].
After performing different heuristic tests and using sensitivity analysis for this fore-
Random Forest
cast technique, it is deduced that the best parameters for tuning are those described in
Predicted Variable TableInputs
4. Variables N◦ Estimators Max Leaf Nodes Max Features
Temperature Solar radiation, relative humidity, wind speed 100 3000 0.1
Table 4. Tuning parameters for the random forest technique.
Solar radiation Temperature, relative humidity, wind speed 100 3000 0.1
Wind speed Random
Temperature, solar radiation, relative Forest
humidity 100 2000 0.3
Predicted Variable
Relative Humidity Inputs Variables
Temperature, solar radiation, wind speed N° Estimators
100 Max Leaf
2000Nodes Max Features
0.2
Solar radiation, relative humidity,
Temperature 100 3000 0.1
wind speed
2.4.5. Extreme Gradient Boosting (XGboost)
Temperature, relative humidity,
Solar radiation The XGBoost algorithm is a scalable100 3000that can be used0.1
tree-boosting system for both
wind speed
classification and regression tasks. It performs a second-order Taylor expansion on the
Temperature,
loss function solar
andradiation, rela-
can automatically use100
multiple threads of the central processing
Wind speed 2000 0.3 unit
tive humidity
(CPU) for parallel computing. In addition, XGBoost uses a variety of methods to avoid
Temperature, solar radiation, wind
overfitting [25].
Relative Humidity 100 2000 0.2
Figure 3 speed
shows the XGBoost algorithm; decision trees are created sequentially (De-
cision Tree-1, Decision Tree-2, Decision Tree-N) and weights play an important role in
2.4.5. Extreme
XGBoost. Gradient
Weights Boosting
are assigned to (XGboost)
all independent variables, which are then entered into
the decision tree that predicts the outcomes (Result-1, Result-2,
The XGBoost algorithm is a scalable tree-boosting systemResult-N). The
that can be weights
used of
for both
variables incorrectly predicted by the tree are increased and these variables
classification and regression tasks. It performs a second-order Taylor expansion on the are then fed
into
lossthe secondand
function decision tree (Residual error).
can automatically These individual
use multiple threads ofpredictors areprocessing
the central then groupedunit
(Average) to give a strong and more accurate model (Prediction).
Figure 3 shows the XGBoost algorithm; decision trees are created sequentially (Deci-
sion Tree-1, Decision Tree-2, Decision Tree-N) and weights play an important role in
XGBoost. Weights are assigned to all independent variables, which are then entered into
the decision tree that predicts the outcomes (Result-1, Result-2, Result-N). The weights of
Electronics 2023, 12, 1007
variables incorrectly predicted by the tree are increased and these variables are then fed
7 of 19
into the second decision tree (Residual error). These individual predictors are then
grouped (Average) to give a strong and more accurate model (Prediction).

Figure3.3.Structure
Figure StructureofofananXGBoost
XGBoostalgorithm
algorithmfor
forregression.
regression.

After
Afterperforming
performing different heuristic
different tests
heuristic andand
tests using sensitivity
using analysis
sensitivity for this
analysis forecast
for this fore-
technique, it is deduced
cast technique, that thethat
it is deduced bestthe
parameters for its tuning
best parameters for itsare thoseare
tuning described in Table 5.
those described in
Table 5.
Table 5. Tuning parameters for the XGboost technique.
Table 5. Tuning parameters for the XGboost technique.
XGBoost
Predicted Variable Inputs Variables XGBoost Max Depth N◦ Estimators
Temperature SolarPredicted Variable
radiation, relative Inputs
humidity, wind speedVariables 2 Max Depth N° 100Estimators
Solar radiation, relative humidity,
Solar radiation
Temperaturerelative humidity,
Temperature,
2
2 20
100
wind speed wind speed
Temperature, relative humidity,
Temperature,
Solar radiationsolar radiation, 2
Wind speed wind speed 5 19 20
relative humidity
Temperature, Temperature, solar radiation,
Relative Humidity Wind speed solar radiation, 7 5 19 19
wind speed relative humidity
Temperature, solar radiation,
Relative Humidity 7 19
2.4.6. Neural Network—Multilayerwind speed
Perceptron
It is an effective and widely used model for modeling many real situations. The
2.4.6. Neural Network—Multilayer Perceptron
multilayer perceptron is a hierarchical structure consisting of several layers of fully inter-
connectedIt is an effective
neurons, and input
which widely used model
neurons for modeling
are outputs many real
of the previous situations.
layer. Figure 4The mul-
shows
tilayer perceptron is a hierarchical structure consisting of several layers
the structure of a multilayer perceptron neural network; the input layer is made up of r of fully intercon-
nected
units neurons,
(where which
r is the numberinputof neurons
external are outputs
inputs) of the previous
that merely distributelayer. Figure
the input 4 shows
signals to
thenext
the structure
layer; of
theahidden
multilayer layerperceptron
is made upneural network;
of neurons thatthe
haveinput layer is made
no physical of 𝑟
contactupwith
units
the (where
outside; the𝑟number
is the number
of hidden of layers
external inputs) that
is variable merely
(u); and the distribute
output layer theisinput
madesignals
up of
l to the next
neurons layer;l is
(where thethehidden
number layer is madeoutputs)
of external up of neurons that have
whose outputs no physical
constitute contact
the vector
of external outputs of the multilayer perceptron [26].
The training of the neural network consists of calculating the linear combination
from a set of input variables, with a bias term, applying an activation function, generally
the threshold or sign function, giving rise to the network output. Thus, the weights
of the network are adjusted by the method of supervised learning by error correction
(backpropagation), in such a way that the expected output is compared with the value of
the output variable to be obtained, the difference being the error or residual. Each neuron
behaves independently of the others: each neuron receives a set of input values (an input
vector), calculates the scalar product of this vector and the vector of weights, adds its own
bias, applies an activation function to the result, and returns the final result obtained [26].
Electronics 2023, 12, x FOR PEER REVIEW 8

with the outside; the number of hidden layers is variable (𝑢); and the output layer is
Electronics 2023, 12, 1007 up of 𝑙 neurons (where 𝑙 is the number of external outputs) whose 8outputs
of 19 cons
the vector of external outputs of the multilayer perceptron [26].

Figure 4. Structure of a4.multilayer

Figure Structure perceptron neural
of a multilayer network.neural network.
perceptron

In general, all weights and biases will be different. The output of the multilayer
The training of the neural network consists of calculating the linear combination
perceptron neural network is defined by Equation (4). Where: yk is the output, f k activation
a set of input variables, with a bias term, applying an activation function, generall
function of output layer, θk0 bias of the output layer, Wij hidden layer weights, y0j output of
threshold or sign function, giving rise to the network output. Thus, the weights o
the hidden layer, f j0 activation function of the hidden layer, Xi neuron inputs, Wjk0 output
network are adjusted by the method of supervised learning by error correction (back
layer weights,agation),
θ j bias ofinhidden
such a layer,
way that r is the
the expected
number of inputs
output is for withj the
the neuron
compared from the of the o
value
hidden layer, and u is the number of inputs for the neuron k from the output layer
variable to be obtained, the difference being the error or residual. Each neuron beh [27].
independently of the others: each neuronreceives a set of input values (an input ve
r
0 0
= f j ∑ofXthis
y j product − θ j and the vector of weights, adds its own
calculates the scalar i Wij vector
i =1
applies an activation functionu to the result, ! and returns the final result obtained
(4) [26]
0 0 0
In general, all yweights
k = f k and ∑ ybiases
j Wjk − θk
will be different. The output of the multilaye
ceptron neural network is defined j =1 by equation (4). Where: 𝑦 is the output, 𝑓 activ
function of output layer, 𝜃 bias of the output layer, 𝑊 hidden layer weights, 𝑦 o
For this research, backpropagation was used as a training technique. After performing
of the hidden layer, 𝑓 activation function of the hidden layer, 𝑋 neuron inputs, 𝑊
different heuristic tests and using sensitivity analysis for this forecasting technique, it is
put best
deduced that the weights, 𝜃for
layerparameters bias
itsof hidden
tuning those𝑟described
arelayer, is the number of inputs
in Table 6. for the neuron 𝑗
the hidden layer, and 𝑢 is the number of inputs for the neuron 𝑘 from the output
Table 6. Tuning[27].
parameters for the multilayer perceptron neural network technique.

Neural Network—Multilayer Perceptron𝑦 = 𝑓 𝑋𝑊 −𝜃

Predicted Input Layer Batch Hidden Layer Activation
Inputs Variables N◦ Epoch
Variable Neurons Size Neurons Function

Temperature
Solar radiation, relative
3 5000 𝑦 =𝑓
128 𝑦 32
𝑊 −𝜃 Hidden: ReLU
humidity, wind speed Out: Sigmoid
Temperature, relative Hidden: ReLU
Solar radiation 3 research,5000
For this 128
backpropagation was used32as a training technique.
humidity, wind speed Out: Sigmoid After perf
ing different heuristic tests and using sensitivity analysis for this forecasting techniq
Temperature, solar radiation, Hidden: ReLU
Wind speed is deduced3 that the best
3000parameters128for its tuning32
are those described in Table 6.
relative humidity Out: Sigmoid
Relative Temperature, solar radiation,
Table 6. Tuning parameters Hidden: ReLU
3 5000 for the multilayer
128 perceptron
32 neural network technique.
Humidity wind speed Out: Sigmoid
Electronics 2023, 12, 1007 9 of 19

3. Results
3.1. Indicators for Assessing the Performance of Weather Forecasting Models
To measure the performance of the forecast techniques for each of the variables de-
scribed above, two types of metrics were used: to evaluate the forecast accuracy, the mean
square error RMSE is used, which allows comparing their results and defining the technique
with the lowest error, and therefore, the best method for each variable to be predicted. In
addition, to determine if the implemented models perform well in their training and to
define their predictive ability, the coefficient of determination is R2 .

3.1.1. Coefficient of Determination (R2 )

R2 or coefficient of determination can be in the range of [−∞, 1] it is used to determine
the ability of a model to predict future results. The best possible result is 1, and occurs when
the prediction coincides with the values of the target variable, while the closer to zero, the
less well-fitted the model is and, therefore, the less reliable it is. R2 can take negative values
because the prediction can be arbitrarily bad [28]. It is defined as Equation (5), described
by 1 minus the sum of total squares divided by the sum of squares of the residuals.
2
∑ (yc − ŷc )
R2 = 1 − 2
(5)
∑ (yc − y)

where: yc : are the values taken by the target variable, ŷc : are the values of the prediction,
and y: is the mean value of the values taken by the target variable.

3.1.2. Mean Square Error (RMSE)

The root mean square error, also known as root mean square deviation, measures the
amount of error between two sets of data. That is, it compares the predicted value with the
observed or known value [28]. It is given by Equation (6):
s
1 o
o c∑
RMSE = (yc − ŷc )2 (6)
=1

where: yc : are the values taken by the target variable, ŷc : are the values of the prediction,
and o: is the sample size.

3.1.3. Mean Absolute Percentage Error (MAPE)

Mean absolute percentage error is an evaluation metric for regression problems; the
idea of this metric is to be sensitive to relative errors. MAPE is the mean of all absolute
percentage errors between the predicted and actual values [29]. It is given by Equation (7):

1 o yc − ŷc
o c∑
MAPE = ∗ 100% (7)
=1 yc

where yc : are the values taken by the target variable, ŷc : are the values of the prediction,
and o: is the sample size.
Equation (7) helps to understand one of the important caveats when using MAPE,
since to calculate this metric, you need to divide the difference by the actual value. This
means that if you have actual values close to 0 or at 0, the MAPE score will receive a
division error by 0 or will be extremely high. Therefore, it is recommended not to use
MAPE when it has real values close to 0 [30].

3.1.4. Mean Absolute Error (MAE)

Mean absolute error is a common metric to use for measuring the error of regression
predictions. The mean absolute error of a model is the mean of the absolute values of the
individual prediction errors on over all instances in the test set. Each prediction error is
Electronics 2023, 12, 1007 10 of 19

the difference between the true value and the predicted value for the instance [16,31]. It is
given by Equation (8):
1 o
o c∑
MAE = | yc − ŷc | (8)
=1
where: yc : are the values taken by the target variable, ŷc : are the values of the prediction,
and o: is the sample size.

3.2. Case Study

For the implementation of the forecast techniques for meteorological variables (tem-
perature, wind speed, solar radiation, and relative humidity), the Python programming
language was used. Information was obtained from the Parque de la Familia Baños
meteorological station, located in Ecuador, Tungurahua province, Baños, coordinates
X = 9, 845, 439, Y = 791, 471. From the database obtained, 1 year of information was
available (from 23 July 2021 to 15 June 2022) with a sampling time of 5 min having a total
of 93,780 data for each variable, where 80% of the database (75,024 data) is used to test the
models, 20% (18,756 data) to test the models, and 2 days (576 data) were used for validation.
To obtain the values of the evaluation metrics (RMSE, MAE, MAPE y R2 ) the validation
data corresponding to the days 10 June 2022 and 11 June 2022 were used.
The forecast techniques implemented for all variables are the following: multiple linear
regression, polynomial regression, decision tree, random forest, XGboost, and multilayer
perceptron neural network.
To identify which of the models is more efficient, evaluation metrics such as root mean
square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error
(MAE) are used over the entire validation range, while to evaluate whether the forecasting
algorithms fit correctly, the R2 metric is used. It is important to note that these metrics
evaluate different aspects; the RMSE, MAPE, and MAE evaluate the forecasting error, while
R2 allows to analyze how well a regression model fits the real data.

3.2.1. Temperature Forecasting

Table 7 shows the results of the evaluation metrics: root mean square error (RMSE),
mean absolute percentage error (MAPE), mean absolute error (MAE), and coefficient
of determination (R2 ) for each of the techniques used for temperature forecasting. The
calculation of the root mean square error, mean absolute percentage error, and mean
absolute error was obtained by averaging the errors of the validation data (576 data), while
the calculation of the coefficient of determination (R2 ) used the data from the training set
and the test set (93,780 data).

Table 7. Evaluation metrics for temperature forecasting.

Coefficient of Mean Absolute Mean Absolute Percentage Mean Square Error

Technique
Determination (R2 ) Error (MAE) [◦ C] Error (MAPE) [%] (RMSE) [◦ C]
Multiple linear
0.8244 0.6597 3.71 0.8453
regression
Polynomial regression 0.8406 0.6097 3.51 0.8146
Decision tree 0.8593 0.5097 2.95 0.7333
Random forest 0.8631 0.4728 2.73 0.6621
XGboost 0.8599 0.5335 3.09 0.7565
Multilayer perceptron 0.8226 0.9124 5.51 1.2498

Table 7 shows that R2 obtained from the implemented algorithms converge to appro-
priate values, i.e., there is a correct approximation between the real temperature and the
predicted temperature, thus guaranteeing the good performance of the algorithm, which
Electronics 2023, 12, 1007 11 of 19

allows a comparison of the performance in terms of forecast error. Comparison of the root
mean square errors (RMSE), mean absolute percentage errors (MAPE), and mean absolute
errors (MAE), and analysis of the coefficient of determination R2 of the different techniques
implemented show that the best performing technique for forecasting the temperature
variable is Random Forest, with an R2 of 0.8631, MAE of 0.4728 ◦ C, MAPE of 2.73%, and
RMSE of 0.6621 ◦ C. This is followed by XGBoost, with an R2 of 0.8599, MAE of 0.5335 ◦ C,
MAPE of 3.09%, and RMSE of 0.7565 ◦ C.
Figure 5 shows the real (red) and prediction (blue) profiles using the different Machine
Learning techniques to predict the temperature variable: (a) Multiple linear regression
technique, (b) Polynomial regression technique, (c) Decision tree technique, (d) Random
Forest technique, (e) XGboost technique, (f) Multilayer perceptron neural network tech-
nique. Figure 5c,d, validate that the best performance corresponds to the Decision tree and
Random forest techniques.

3.2.2. Relative Humidity Forecasting

Table 8 shows the results of the evaluation metrics: root mean square error (RMSE),
mean absolute percentage error (MAPE), mean absolute error (MAE), and coefficient of
determination (R2 ) for each of the techniques used for relative humidity forecasting. The
calculation of the root mean square error, mean absolute percentage error, and mean
absolute error was obtained by averaging the errors of the validation data (576 data), while
the calculation of the coefficient of determination (R2 ) used the data from the training set
and the test set (93,780 data).

Table 8. Evaluation metrics for relative humidity forecasting.

Coefficient of Mean Absolute Mean Absolute Percentage Mean Square Error

Technique
Determination (R2 ) Error (MAE) [RH] Error (MAPE) [%] (RMSE) [RH]
Multiple linear
0.7815 3.0900 3.56 3.7475
regression
Polynomial regression 0.8420 2.2816 2.68 3.0163
Decision tree 0.8547 2.2685 2.65 3.2083
Random forest 0.8583 2.1380 2.50 2.9003
XGboost 0.8597 2.2907 2.67 3.1444
Multilayer perceptron 0.8013 4.6055 5.64 5.5759

Table 8 shows that R2 obtained from the implemented algorithms converge to appro-
priate values, i.e., there is a correct approximation between the real relative humidity and
the predicted relative humidity, thus guaranteeing the good performance of the algorithm,
which allows a comparison of the performance in terms of forecast error. Comparison
of the root mean square errors (RMSE), mean absolute percentage errors (MAPE), and
mean absolute errors (MAE), and analysis of the coefficient of determination R2 of the
different techniques implemented show that the best performing techniques for forecasting
the relative humidity variable are Random Forest, with an R2 of 0.8583, MAE of 2.1380 RH,
MAPE of 2.50%, and RMSE of 2.9003 RH; and XGBoost, with an R2 of 0.8597, MAE of
2.2907 RH, MAPE of 2.67%, and RMSE of 3.1444 RH.
Figure 6 shows the real (red) and prediction (blue) profiles using the different Machine
Learning techniques to predict the relative humidity variable: (a) Multiple linear regression
technique, (b) Polynomial regression technique, (c) Decision tree technique, (d) Random
forest technique, (e) XGboost technique, (f) Multilayer perceptron neural network technique.
Figure 6d and Figure 6c validate that the best performance corresponds to the Random
forest and Decision tree techniques.
Polynomial regression 0.8406 0.6097 3.51 0.8146
Decision tree 0.8593 0.5097 2.95 0.7333
Random forest 0.8631 0.4728 2.73 0.6621
XGboost 0.8599 0.5335 3.09 0.7565
Multilayer perceptron
Electronics 2023, 12, 1007 0.8226 0.9124 5.51 1.2498 12 of 19

Electronics 2023, 12, x FOR PEER REVIEW 12 of 20

(a) (b)

(e) (f)
Figure 5. Temperature forecast techniques: (a) Multiple linear regression, (b) Polynomial regression,
Figure 5. Temperature forecast techniques: (a) Multiple linear regression, (b) Polynomial regression,
(c) Decision tree, (d) Random forest, (e) XGboost, (f) Multilayer perceptron neural network.
(c) Decision tree, (d) Random forest, (e) XGboost, (f) Multilayer perceptron neural network.
3.2.2. Relative Humidity Forecasting
Table 8 shows the results of the evaluation metrics: root mean square error (RMSE),
mean absolute percentage error (MAPE), mean absolute error (MAE), and coefficient of
determination (R ) for each of the techniques used for relative humidity forecasting. The
calculation of the root mean square error, mean absolute percentage error, and mean ab-
solute error was obtained by averaging the errors of the validation data (576 data), while
the calculation of the coefficient of determination (R ) used the data from the training set
and the test set (93,780 data).
Table 8 shows that R obtained from the implemented algorithms converge to ap-
propriate values, i.e., there is a correct approximation between the real relative humidity
and the predicted relative humidity, thus guaranteeing the good performance of the algo-
(MAPE) [%]
Multiple linear regression 0.7815 3.0900 3.56 3.7475
Polynomial regression 0.8420 2.2816 2.68 3.0163
Decision tree 0.8547 2.2685 2.65 3.2083
Random forest 0.8583 2.1380 2.50 2.9003
Electronics 2023, 12, 1007
XGboost 0.8597 2.2907 2.67 3.1444 13 of 19
Multilayer perceptron 0.8013 4.6055 5.64 5.5759

(a) (b)

Electronics 2023, 12, x FOR PEER REVIEW 14 of 20

(e) (f)
Figure 6. Techniques for relative humidity forecasting: (a) Multiple linear regression, (b) Polynomial
Figure 6. Techniques for relative humidity forecasting: (a) Multiple linear regression, (b) Polynomial
regression, (c) Decision tree, (d) Random forest, (e) XGboost, (f) Multilayer perceptron neural net-
regression,
work. (c) Decision tree, (d) Random forest, (e) XGboost, (f) Multilayer perceptron neural network.

3.2.3. Solar Radiation Forecasting

Table 9 shows the results of the evaluation metrics: root mean square error (RMSE),
mean absolute error (MAE), and coefficient of determination (R ) for each of the tech-
niques used for solar radiation forecasting. The calculation of the root mean square error,
Electronics 2023, 12, 1007 14 of 19

3.2.3. Solar Radiation Forecasting

Table 9 shows the results of the evaluation metrics: root mean square error (RMSE),
mean absolute error (MAE), and coefficient of determination (R2 ) for each of the techniques
used for solar radiation forecasting. The calculation of the root mean square error, and
mean absolute error was obtained by averaging the errors of the validation data (576 data),
while the calculation of the coefficient of determination (R2 ) used the data from the training
set and the test set (93,780 data).

Table 9. Evaluation metrics for solar radiation forecasting.

Coefficient of Determination Mean Absolute Error (MAE) Mean Square Error (RMSE)
Technique
(R2 ) [W/m2 ] [W/m2 ]
Multiple linear regression 0.6689 106.9741 164.7435
Polynomial regression 0.7394 76.6667 129.1836
Decision tree 0.7253 75.8177 127.3530
Random forest 0.7333 65.8105 105.9141
XGboost 0.7075 87.6137 145.0170
Multilayer perceptron 0.7423 88.5897 140.0681

Table 9 shows that R2 obtained from the implemented algorithms converge to appro-
priate values, i.e., there is a correct approximation between the real solar radiation and
the predicted solar radiation, thus guaranteeing the good performance of the algorithm,
which allows a comparison of the performance in terms of forecast error. Comparison of
the root mean square errors (RMSE), and mean absolute errors (MAE), and analysis of the
coefficient of determination R2 of the different techniques implemented show that the best
performing techniques for forecasting the solar radiation variable are Random Forest with
an R2 of 0.7333, MAE of 65.8105 W/m2 , and RMSE of 105.9141 W/m2 ; and Decision Tree
with an R2 of 0.7253, MAE of 75.8177 W/m2 , and RMSE of 127.3530 W/m2 .
Figure 7 shows the real (red) and prediction (blue) profiles using the different Machine
Learning techniques to predict the variable solar radiation: (a) Multiple linear regression
technique, (b) Polynomial regression technique, (c) Decision tree technique, (d) Random
forest technique, (e) XGboost technique, (f) Multilayer perceptron neural network technique.
Figure 7d validates that the best performance corresponds to the Random forest technique.

3.2.4. Wind Speed Forecasting

Table 10 shows the results of the evaluation metrics: root mean square error (RMSE),
mean absolute error (MAE), and coefficient of determination (R2 ) for each of the techniques
used for wind speed forecasting. The calculation of the root mean square error and mean
absolute error was obtained by averaging the errors of the validation data (576 data), while
the calculation of the coefficient of determination (R2 ) used the data from the training set
and the test set (93,780 data).
Table 10 shows that R2 obtained from the implemented algorithms converge to appro-
priate values, i.e., there is an acceptable approximation between the real wind speed and
the predicted wind speed, thus guaranteeing the good performance of the algorithm, which
allows a comparison of the performance in terms of forecast error. Comparison of the root
mean square errors (RMSE) and mean absolute errors (MAE) and analysis of the coefficient
of determination R2 of the different techniques implemented show that the best performing
techniques for forecasting the wind speed variable are Random Forest with an R2 of 0.3660,
MAE of 0.1097 m/s, and RMSE of 0.2136 m/s; and XGBoost with an R2 of 0.3866, MAE
of 0.1439 m/s, and RMSE of 0.3131 m/s. It should be taken into account that due to the
high variability of wind speed, the implemented techniques have a lower coefficient of
determination compared to the other variables; however, forecasts with acceptable errors
were obtained. In this case, the value of the mean absolute percentage errors (MAPE) is
Electronics 2023, 12, x FOR PEER REVIEW 15 of 20

Electronics 2023, 12, 1007 15 of 19

XGboost 0.7075 87.6137 145.0170

Multilayer perceptron 0.7423 88.5897 140.0681
not taken into account because it is used only when it is known that the quantity to be
predicted remains well above 0.

(a) (b)

(e) (f)

Figure 7. Solar radiation forecast techniques: (a) Multiple linear regression, (b) Polynomial regression,
(c) Decision tree, (d) Random forest, (e) XGboost, (f) Multilayer perceptron neural network.
Electronics 2023, 12, 1007 16 of 19

Table 10. Evaluation metrics for wind speed forecasting.

Coefficient of Determination Mean Absolute Error Mean Square Error

Technique
(R2 ) (MAE) [m/s] (RMSE) [m/s]
Multiple linear regression 0.3428 0.1614 0.3354
Polynomial regression 0.3770 0.1428 0.3159
Decision tree 0.2142 0.1256 0.2705
Random forest 0.3660 0.1097 0.2136
XGboost 0.3866 0.1439 0.3131
Multilayer perceptron 0.3270 0.1654 0.3616

Figure 8 shows the real (red) and prediction (blue) profiles using the different Machine
Learning techniques to predict the wind speed variable: (a) Multiple linear regression
technique, (b) Polynomial regression technique, (c) Decision tree technique, (d) Random
Electronics 2023, 12, x FOR PEER REVIEW 17 of 20
forest technique, (e) XGboost technique, (f) Multilayer perceptron neural network technique.
Figure 8d validates that the best performance corresponds to the Random forest technique.

(a) (b)

Figure 8. Cont.
Electronics 2023, 12, 1007 17 of 19

(e) (f)
Figure 8. Techniques for wind speed forecast: (a) Multiple linear regression, (b) Polynomial regres-
Figure 8. Techniques for wind speed forecast: (a) Multiple linear regression, (b) Polynomial regression,
sion, (c) Decision tree, (d) Random forest, (e) XGboost, (f) Multilayer perceptron neural network.
(c) Decision tree, (d) Random forest, (e) XGboost, (f) Multilayer perceptron neural network.

4. Conclusions
For the forecasting of meteorological variables in this research, information obtained
from the Parque de la Familia Baños meteorological station located in Ecuador was used
and the following prediction techniques were tested: multiple linear regression, polynomial
regression, decision tree, random forest, XGBoost, and multilayer perceptron neural net-
work. For forecasting the temperature variable, a better result is obtained by using Random
Forest with an R2 of 0.8631, MAE of 0.4728 ◦ C, MAPE of 2.73%, and RMSE of 0.6621 ◦ C.
In addition, XGBoost also performed well with an R2 of 0.8599, MAE of 0.5335 ◦ C, MAPE
of 3.09%, and RMSE of 0.7565 ◦ C. For forecasting the relative humidity variable, a better
result is obtained by using Random Forest with an R2 of 0.8583, MAE of 2.1380 RH, MAPE
of 2.50%, and RMSE of 2.9003 RH. In addition, XGBoost also performed well with an R2
of 0.8597, MAE of 2.2907 RH, MAPE of 2.67%, and RMSE of 3.1444 RH. For forecasting
the solar radiation variable, a better result is obtained by using Random Forest with an
R2 of 0.7333, MAE of 65.8105 W/m2 , and RMSE of 105.9141 W/m2 . In addition, Deci-
sion Tree also performed well with an R2 of 0.7253, MAE of 75.8177 W/m2 , and RMSE
of 127.3530 W/m2 . For forecasting the wind speed variable, a better result is obtained by
using Random Forest, with an R2 of 0.3660, MAE of 0.1097 m/s, and RMSE of 0.2136 m/s.
In addition, XGBoost also performed well, with an R2 of 0.3866, MAE of 0.1439 m/s, and
RMSE of 0.3131 m/s.
It can be observed that wind speed has the highest variability compared to the other
predicted variables, therefore, the results of the techniques implemented show that the
coefficient of determination R2 of this variable has a lower value. This is due to the type of
signal we are trying to predict; however, acceptable predictions were obtained.
The prediction of meteorological variables (temperature, solar radiation, wind speed,
and relative humidity) will allow future projects to be implemented in the study area, such
as intelligent agriculture to support food problems in that area and the implementation
of a microgrid based on renewable resources where prediction models will support the
planning and operation of the microgrid in real time, allowing clean energy to this locality,
contributing to the reduction in the use of fossil resources, which is the goal that different
countries have set as part of their policies.
Electronics 2023, 12, 1007 18 of 19

Author Contributions: Conceptualization, J.A.S., J.F.T., J.R.L. and D.R.R.; methodology, J.A.S., J.F.T.,
J.R.L. and D.R.R.; software J.A.S. and J.F.T.; validation, J.A.S. and J.F.T.; formal analysis, J.A.S., J.F.T.,
J.R.L. and D.R.R.; investigation, J.A.S., J.F.T. and J.R.L.; resources, J.A.S. and J.F.T.; data curation,
J.A.S. and J.F.T.; writing—original draft preparation, J.A.S., J.F.T., J.R.L. and D.R.R..; writing—review
and editing, J.A.S., J.F.T., J.R.L. and D.R.R.; visualization, J.A.S., J.F.T., J.R.L. and D.R.R.; supervision,
J.R.L. and D.R.R.; project administration, J.R.L.; funding acquisition, J.R.L. All authors have read and
agreed to the published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: Not applicable.
Acknowledgments: This work was supported in part by the Universidad de las Fuerzas Armadas
ESPE through the Project “Optimal energy management systems for hybrid generation systems”,
under Project 2023-pis-03. In addition, the authors would like to thank to the project EE-GNP-0043-
2021-ESPE, REDTPI4.0-CYTED, Conv-2022-05-UNACH, “SISMO-ROSAS”–UPS.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Ayala, M.F. Analisis de la Dinamica Caoticapara la Series Temporales de Variables Meteorologicas en la Estacion Climatica de Chone;
Universidad de las Fuerzas Armadas ESPE: Sangolquí, Ecuador, 2017; Available online: http://repositorio.espe.edu.ec/handle/
21000/13629 (accessed on 24 November 2022).
2. Erdil, A.; Arcaklioglu, E. The prediction of meteorological variables using artificial neural network. Neural Comput. Appl. 2013, 22,
1677–1683. [CrossRef]
3. Ruiz-Ayala, D.C.; Vides-Herrera, C.A.; Pardo-García, A. Monitoreo de variables meteorológicas a través de un sistema inalámbrico
de adquisición de datos. Rev. Investig. Desarro. Innov. 2018, 8, 333–341. [CrossRef]
4. Inzunza, J.C. Meteorologia Descriptiva. Univ. Concepción Dep. Geofísica 2015, 1–34. Available online: http://www2.udec.cl/
~jinzunza/meteo/cap1.pdf (accessed on 24 November 2022).
5. Millán, H.; Kalauzi, A.; Cukic, M.; Biondi, R. Nonlinear dynamics of meteorological variables: Multifractality and chaotic
invariants in daily records from Pastaza, Ecuador. Theor. Appl. Climatol. 2010, 102, 75–85. [CrossRef]
6. Acurio, W.; Pilco, V. Técnicas Estadísticas para la Modelación y Predicción de la Temperatura y Velocidad del Viento en la Provincia de
Chimborazo; Escuela Superior Politénica de Chimborazo: Riobamba, Ecuador, 2019. Available online: http://dspace.espoch.edu.
ec/handle/123456789/10955 (accessed on 28 November 2022).
7. Tong, H. Non-Linear Time Series: A Dynamical System Approach; Oxford University Press: Oxford, UK, 1990.
8. Palma-Behnke, R.; Benavides, C.; Aranda, E.; Llanos, J.; Sáez, D. Energy management system for a renewable based microgrid
with a demand side management mechanism. In Proceedings of the IEEE Symposium on Computational Intelligence Applications
in Smart Grid 2011, Paris, France, 11–15 April 2011; pp. 1–8. [CrossRef]
9. Rodríguez, M.; Salazar, A.; Arcos-Aviles, D.; Llanos, J.; Martínez, W.; Motoasca, E. A Brief Approach of Microgrids Implementation
in Ecuador: A Review. In Lecture Notes in Electrical Engineering; Springer: Cham, Switzerland, 2021; Volume 762, pp. 149–163.
[CrossRef]
10. Llanos, J.; Morales, R.; Núñez, A.; Sáez, D.; Lacalle, M.; Marín, L.G.; Hernández, R.; Lanas, F. Load estimation for microgrid
planning based on a self-organizing map methodology. Appl. Soft Comput. 2017, 53, 323–335. [CrossRef]
11. Caquilpan, V.; Saez, D.; Hernandez, R.; Llanos, J.; Roje, T.; Nunez, A. Load estimation based on self-organizing maps and
Bayesian networks for microgrids design in rural zones. In Proceedings of the 2017 IEEE PES Innovative Smart Grid Technologies
Conference—Latin America (ISGT Latin America), Quito, Ecuador, 20–22 September 2017; pp. 1–6. [CrossRef]
12. Palma-Behnke, R.; Benavides, C.; Lanas, F.; Severino, B.; Reyes, L.; Llanos, J.; Saez, D. A microgrid energy management system
based on the rolling horizon strategy. IEEE Trans. Smart Grid 2013, 4, 996–1006. [CrossRef]
13. Rey, J.M.; Vera, G.A.; Acevedo-Rueda, P.; Solano, J.; Mantilla, M.A.; Llanos, J.; Sáez, D. A Review of Microgrids in Latin America:
Laboratories and Test Systems. IEEE Lat. Am. Trans. 2022, 20, 1000–1011. [CrossRef]
14. Javier, G.; Quevedo-Nolasco, A.; Castro-Popoca, M.; Arteaga-Ramírez, R.; Vázquez-Peña, M.A.; Zamora-Morales, B.P.; Aguado-
Rodríguez, G.J.; Quevedo-Nolasco, A.; Castro-Popoca, M.; Arteaga-Ramírez, R.; et al. Predicción de Variables Meteorológicas por
Medio de Modelos Arima. Agrociencia 2016, 50, 1–13.
15. Gulhane, V.A.; Rode, S.V.; Pande, C.B. Correlation Analysis of Soil Nutrients and Prediction Model Through ISO Cluster
Unsupervised Classification with Multispectral Data. Springer Link 2022, 82, 2165–2184. [CrossRef]
16. Pande, C.B.; Al-Ansari, N.; Kushwaha, N.L.; Srivastava, A.; Noor, R.; Kumar, M.; Moharir, K.N.; Elbeltagi, A. Forecasting of SPI
and Meteorological Drought Based on the Artificial Neural Network and M5P Model Tree. Land 2022, 11, 2040. [CrossRef]
17. Mora Cunllo, V.E. Diseño e Implementación de un Modelo Software Basado en Técnicas de Inteligencia Artificial, para Predecir el
índice de Radiación Solar en Riobamba-Ecuador. 2015. Available online: http://repositorio.espe.edu.ec/bitstream/21000/12216/
1/T-ESPEL-MAS-0027.pdf (accessed on 24 November 2022).
Electronics 2023, 12, 1007 19 of 19

18. Universitario, S.; Estad, E.N.; Aplicada, S.; Fern, R.A.; Javier, F.; Morales, A. Series Temporales Avanzadas: Aplicación de Redes
Neuronales para el Pronóstico de Series de Tiempo; Universidad de Granada: Granada, Spain, 2021.
19. Raschka, S.; Patterson, J.; Nolet, C. Machine learning in python: Main developments and technology trends in data science,
machine learning, and artificial intelligence. Information 2020, 11, 193. [CrossRef]
20. Carlos, J.; Rodriguez, M. Desarrollo de una Herramienta Inteligente Centrada en Visión Plantaciones de Arroz, Usando Lenguaje
de Programación Python. Ph.D. Thesis, Universidad de Guayaquil, Guayaquil, Ecuador, 2022.
21. Ben Bouallègue, Z.; Cooper, F.; Chantry, M.; Düben, P.; Bechtold, P.; Sandu, I. Statistical modelling of 2m temperature and 10m
wind speed forecast errors. Mon. Weather. Rev. 2022. Available online: https://journals.ametsoc.org/view/journals/mwre/aop/
MWR-D-22-0107.1/MWR-D-22-0107.1.xml (accessed on 18 January 2023). [CrossRef]
22. Montero Granados, R. Modelos de Regresión Lineal Múltiple; Technical Report; Documentos de Trabajo en Economía Aplicada;
Universidad de Granada: Granada, Spain, 2006.
23. Aurélien, G. Hands-on Machine Learning with Scikit-Learn & Tensorflow; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2017.
24. Elbeltagi, A.; Kumar, M.; Kushwaha, N.L.; Pande, C.B.; Ditthakit, P.; Vishwakarma, D.K.; Subeesh, A. Drought indicator analysis
and forecasting using data driven models: Case study in Jaisalmer, India. Stoch. Environ. Res. Risk Assess. 2022, 37, 113–131.
[CrossRef]
25. Luckner, M.; Topolski, B.; Mazurek, M. Application of XGBoost Algorithm. Data Anal. 2017, 10244, 661–671. [CrossRef]
26. Menacho Chiok, C.H. Modelos de regresión lineal con redes neuronales. An. Científicos 2014, 75, 253. [CrossRef]
27. Popescu, M.C.; Balas, V.E.; Perescu-Popescu, L.; Mastorakis, N. Multilayer perceptron and neural networks. WSEAS Trans.
Circuits Syst. 2009, 8, 579–588.
28. Soto-Bravo, F.; González-Lutz, M.I. Analysis of statistical methods to evaluate the performance of simulation models in horticul-
tural crops. Agron. Mesoam. 2019, 30, 517–534. [CrossRef]
29. Gopi, A.; Sharma, P.; Sudhakar, K.; Ngui, W.K.; Kirpichnikova, I.; Cuce, E. Weather Impact on Solar Farm Performance: A
Comparative Analysis of Machine Learning Techniques. Sustainability 2023, 15, 439. [CrossRef]
30. de Myttenaere, A.; Golden, B.; Le Grand, B.; Rossi, F. Mean Absolute Percentage Error for regression models. Neurocomputing
2016, 192, 38–48. [CrossRef]
31. Chai, T.; Draxler, R.R. Root mean square error (RMSE) or mean absolute error (MAE)? -Arguments against avoiding RMSE in the
literature. Geosci. Model Dev. 2014, 7, 1247–1250. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.

Nptel Big Data Full Assignment Solution 2021
100% (8)
Nptel Big Data Full Assignment Solution 2021
36 pages
Weather Prediction Using Machine Learning Techniquess
No ratings yet
Weather Prediction Using Machine Learning Techniquess
53 pages
Machine Learning With Boosting
100% (1)
Machine Learning With Boosting
212 pages
Final Project Report
No ratings yet
Final Project Report
14 pages
R1-Weather Prediction Mode1
No ratings yet
R1-Weather Prediction Mode1
7 pages
Weather Prediction System
No ratings yet
Weather Prediction System
17 pages
Untitled Document
No ratings yet
Untitled Document
7 pages
Weather Forecasting Using Decision Tree Regression
No ratings yet
Weather Forecasting Using Decision Tree Regression
7 pages
REPORT
No ratings yet
REPORT
13 pages
3
No ratings yet
3
2 pages
IoT Framework For Real Time Weather Monitoring Using Machine Learning Techniques
No ratings yet
IoT Framework For Real Time Weather Monitoring Using Machine Learning Techniques
7 pages
Rainfall Prediction Using Machine Learning
100% (1)
Rainfall Prediction Using Machine Learning
6 pages
3.machine Learning Using Smart Weather Forecasting
No ratings yet
3.machine Learning Using Smart Weather Forecasting
6 pages
Weather Forecasting Over Iraq Using Machine Learning
No ratings yet
Weather Forecasting Over Iraq Using Machine Learning
7 pages
Environsciproc 26 00049
No ratings yet
Environsciproc 26 00049
6 pages
DaoGiaKhanh Weather Forecasting Using MachineLearning
No ratings yet
DaoGiaKhanh Weather Forecasting Using MachineLearning
8 pages
Predicting Weather Forecaste Uncertainty With Machine Learning
No ratings yet
Predicting Weather Forecaste Uncertainty With Machine Learning
17 pages
Machine Learning Methods To Weather Forecasting To Predict Apparent Temperature
100% (1)
Machine Learning Methods To Weather Forecasting To Predict Apparent Temperature
11 pages
A Survey of Weather Forecasting Based On Machine Learning and Deep Learning Techniques
No ratings yet
A Survey of Weather Forecasting Based On Machine Learning and Deep Learning Techniques
6 pages
Paper 8-Weather Prediction Using Linear Regression Model-Bnmit IITCEE ICCCI - Conference-1
No ratings yet
Paper 8-Weather Prediction Using Linear Regression Model-Bnmit IITCEE ICCCI - Conference-1
4 pages
Innovative Machine Learning Approaches For Prediction of Weather Parameters
No ratings yet
Innovative Machine Learning Approaches For Prediction of Weather Parameters
8 pages
IJCRT2404206
No ratings yet
IJCRT2404206
6 pages
An Intelligent Regression Approach For Weather Forecasting System Using Machine Learning
No ratings yet
An Intelligent Regression Approach For Weather Forecasting System Using Machine Learning
6 pages
AI Project
No ratings yet
AI Project
30 pages
Document
No ratings yet
Document
3 pages
Atmosphere 14 01174
No ratings yet
Atmosphere 14 01174
20 pages
SSRN Id3380834 Code3457479 240609 192018
No ratings yet
SSRN Id3380834 Code3457479 240609 192018
6 pages
Dynamic Modeling Technique For Weather Prediction: Jyotismita Goswami
No ratings yet
Dynamic Modeling Technique For Weather Prediction: Jyotismita Goswami
8 pages
Atmosphere 13 00180
No ratings yet
Atmosphere 13 00180
17 pages
Pavuluri 2020
No ratings yet
Pavuluri 2020
6 pages
Machine Learning Approaches Usedfor Weather Attributes Forecasting
No ratings yet
Machine Learning Approaches Usedfor Weather Attributes Forecasting
5 pages
DTI
No ratings yet
DTI
8 pages
Atmosphere 14 01635
No ratings yet
Atmosphere 14 01635
27 pages
Rainfall Prediction Using Machine Learning
No ratings yet
Rainfall Prediction Using Machine Learning
6 pages
Weather Prediction Using Machine Learning and IOT: Gopinath N, Vinodh S, Prashanth P, Jayasuriya A, Deasione S
No ratings yet
Weather Prediction Using Machine Learning and IOT: Gopinath N, Vinodh S, Prashanth P, Jayasuriya A, Deasione S
5 pages
1st Paper On Weather Prediction
No ratings yet
1st Paper On Weather Prediction
4 pages
Weather Prediction With Machine Learning
No ratings yet
Weather Prediction With Machine Learning
5 pages
Prediction of Rainfall Using Machine Lea
No ratings yet
Prediction of Rainfall Using Machine Lea
5 pages
Team Autorecovered
No ratings yet
Team Autorecovered
19 pages
Application of Machine Learning Techniques in Temperature Forecast
No ratings yet
Application of Machine Learning Techniques in Temperature Forecast
6 pages
Aml Weather
No ratings yet
Aml Weather
6 pages
A Flexible and Lightweight Deep Learning Weather Forecasting Model
No ratings yet
A Flexible and Lightweight Deep Learning Weather Forecasting Model
12 pages
Applied Sciences: An ANN Model Trained On Regional Data in The Prediction of Particular Weather Conditions
No ratings yet
Applied Sciences: An ANN Model Trained On Regional Data in The Prediction of Particular Weather Conditions
46 pages
Machine Learning Methods in Climate Prediction A S
No ratings yet
Machine Learning Methods in Climate Prediction A S
31 pages
Electronics 11 02359
No ratings yet
Electronics 11 02359
15 pages
Anjali 2019
No ratings yet
Anjali 2019
5 pages
Mathematics: Analysis of A Predictive Mathematical Model of Weather Changes Based On Neural Networks
No ratings yet
Mathematics: Analysis of A Predictive Mathematical Model of Weather Changes Based On Neural Networks
17 pages
Weather Forecasting Basepaper
100% (1)
Weather Forecasting Basepaper
14 pages
PublishedPaperNo.8 2022
100% (1)
PublishedPaperNo.8 2022
14 pages
Sustainability 15 00757 v2
No ratings yet
Sustainability 15 00757 v2
15 pages
Latex Report Main 1
No ratings yet
Latex Report Main 1
26 pages
(IJCST-V12I1P6) :kaushik Kashyap, Rinku Moni Borah, Priyanku Rahang, DR Bornali Gogoi, Prof. Nelson R Varte
No ratings yet
(IJCST-V12I1P6) :kaushik Kashyap, Rinku Moni Borah, Priyanku Rahang, DR Bornali Gogoi, Prof. Nelson R Varte
5 pages
Weather Prediction System Using Machine Learning
No ratings yet
Weather Prediction System Using Machine Learning
4 pages
An Interactive Framework For Analysis of Weather Prediction Using Digital Image Processing
100% (1)
An Interactive Framework For Analysis of Weather Prediction Using Digital Image Processing
8 pages
Sustainability 15 10816
No ratings yet
Sustainability 15 10816
22 pages
Rainfall Prediction Using Machine Learning
No ratings yet
Rainfall Prediction Using Machine Learning
5 pages
Weather Forecasting and Prediction Using Hybrid C5.0
100% (1)
Weather Forecasting and Prediction Using Hybrid C5.0
14 pages
10 1109@icesc48915 2020 9155571
No ratings yet
10 1109@icesc48915 2020 9155571
4 pages
Rainfall Prediction System 2
No ratings yet
Rainfall Prediction System 2
6 pages
Supercomputing for a Changing Climate: Modeling and Predicting Environmental Futures: O7.0 TRANSFORM INFORMATION TECHNOLOGY
From Everand
Supercomputing for a Changing Climate: Modeling and Predicting Environmental Futures: O7.0 TRANSFORM INFORMATION TECHNOLOGY
Elizabeth Mogopodi
No ratings yet
Remote Sensing Technology
From Everand
Remote Sensing Technology
Rajendra Asan
No ratings yet
AI and Robotics Applications in Disaster Response
From Everand
AI and Robotics Applications in Disaster Response
Menka Chopra
No ratings yet
2-IJCI Vol. 3 No. 3-March 2024-Paper1-Dr. Elrasheed
No ratings yet
2-IJCI Vol. 3 No. 3-March 2024-Paper1-Dr. Elrasheed
33 pages
Certified Professional Diploma in Data Science-1
No ratings yet
Certified Professional Diploma in Data Science-1
43 pages
Nimish
No ratings yet
Nimish
4 pages
Nitin Jha (05114802819)
No ratings yet
Nitin Jha (05114802819)
21 pages
Panjala Sravani, V. Rama Krishna: Prospective Projection On Covid-19 Utilising ML Algorithms
No ratings yet
Panjala Sravani, V. Rama Krishna: Prospective Projection On Covid-19 Utilising ML Algorithms
8 pages
Disease Presiction
No ratings yet
Disease Presiction
32 pages
A General Guide To Applying Machine Learning To Computer Architecture - Marked
No ratings yet
A General Guide To Applying Machine Learning To Computer Architecture - Marked
21 pages
IEEE Xplore Citation Plain Text Download 2025.1.5.19.1.38
No ratings yet
IEEE Xplore Citation Plain Text Download 2025.1.5.19.1.38
9 pages
Classification of Tongue - Glossitis Abnormality: Ashiqur Rahman, Amr Ahmed, Shigang Yue
No ratings yet
Classification of Tongue - Glossitis Abnormality: Ashiqur Rahman, Amr Ahmed, Shigang Yue
4 pages
Article PP 1416-1433
No ratings yet
Article PP 1416-1433
18 pages
Irjmets Journal
No ratings yet
Irjmets Journal
13 pages
AdaBoost & DIfference Between Adaboost and Random Forest
No ratings yet
AdaBoost & DIfference Between Adaboost and Random Forest
6 pages
Advanced Machine Learning Model To Detect Spam On Instagram
No ratings yet
Advanced Machine Learning Model To Detect Spam On Instagram
6 pages
AI ML Question Bank With Answers
No ratings yet
AI ML Question Bank With Answers
29 pages
Manifold Oblique Random Forests: Towards Closing The Gap On Convolutional Deep Networks
No ratings yet
Manifold Oblique Random Forests: Towards Closing The Gap On Convolutional Deep Networks
33 pages
Deep Hedging of Financial Options
No ratings yet
Deep Hedging of Financial Options
28 pages
Predictthe Valueof Football Players Using FIFAvideogamedataand Machine Learning Techniques
No ratings yet
Predictthe Valueof Football Players Using FIFAvideogamedataand Machine Learning Techniques
16 pages
Stress Detection With Machine Learning and Deep
No ratings yet
Stress Detection With Machine Learning and Deep
7 pages
Customer Credit Risk Application and Evaluation of Machine Learning and Deep Learning Models
No ratings yet
Customer Credit Risk Application and Evaluation of Machine Learning and Deep Learning Models
5 pages
Report
No ratings yet
Report
14 pages
ET - Project Presentation Solution
No ratings yet
ET - Project Presentation Solution
29 pages
Ensemble Methods Vs Traditional ML Approaches An Empirical Analysis On Web Based Attack Detection in The Context of Industry 5-CHAKIR OUMAIMA
No ratings yet
Ensemble Methods Vs Traditional ML Approaches An Empirical Analysis On Web Based Attack Detection in The Context of Industry 5-CHAKIR OUMAIMA
23 pages
ML 2 (Mainly KNN)
100% (1)
ML 2 (Mainly KNN)
12 pages
Project New
No ratings yet
Project New
13 pages
Prediction of Kidney Failure Disease by Using Machine Learning
No ratings yet
Prediction of Kidney Failure Disease by Using Machine Learning
45 pages
Auto Value Estimation Predicting Car Price
No ratings yet
Auto Value Estimation Predicting Car Price
29 pages
Customer Churn Prediction For A Retail
No ratings yet
Customer Churn Prediction For A Retail
8 pages
Thesis Presentation - Martin Ieong
No ratings yet
Thesis Presentation - Martin Ieong
25 pages

Electronics 12 01007

Uploaded by

Electronics 12 01007

Uploaded by

electronics

Keywords: machine learning; forecasting models; meteorological variables; Python

Electronics 2023, 12, 1007. https://doi.org/10.3390/electronics12041007 https://www.mdpi.com/journal/electronics

2. Design of Forecasting Models for Meteorological Variables

2.2. Data Preprocessing

2.3. Dataset Division

2.4. Design of the Forecasting Models

where: X1 , X2 , . . . Xn : are the predictor or independent variables, b1 , b2 , . . . bn : coefficients

Table 1. Tuning parameters for the multiple linear regression techniques.

Multiple Linear Regression

2.4.2. Polynomial Regression

where: X1 , X2 , . . . Xn : are the predictor or independent variables, b1 , b2 , . . . bn : coefficients

Table 2. Tuning parameters for polynomial regression technique.

2.4.3. Decision Tree

Table 3. Tuning parameters for the decision tree technique.

2.4.4. Random Forest

Figure 4. Structure of a4.multilayer

Neural Network—Multilayer Perceptron𝑦 = 𝑓 𝑋𝑊 −𝜃

3.1.1. Coefficient of Determination (R2 )

3.1.2. Mean Square Error (RMSE)

3.1.3. Mean Absolute Percentage Error (MAPE)

3.1.4. Mean Absolute Error (MAE)

3.2. Case Study

3.2.1. Temperature Forecasting

Table 7. Evaluation metrics for temperature forecasting.

Coefficient of Mean Absolute Mean Absolute Percentage Mean Square Error

3.2.2. Relative Humidity Forecasting

Table 8. Evaluation metrics for relative humidity forecasting.

Coefficient of Mean Absolute Mean Absolute Percentage Mean Square Error

Electronics 2023, 12, x FOR PEER REVIEW 12 of 20

Electronics 2023, 12, x FOR PEER REVIEW 14 of 20

3.2.3. Solar Radiation Forecasting

3.2.3. Solar Radiation Forecasting

Table 9. Evaluation metrics for solar radiation forecasting.

3.2.4. Wind Speed Forecasting

Electronics 2023, 12, 1007 15 of 19

XGboost 0.7075 87.6137 145.0170

Table 10. Evaluation metrics for wind speed forecasting.

Coefficient of Determination Mean Absolute Error Mean Square Error

You might also like