Air Quality Index Forecasting Via Genetic Algorithm-Based Improved Extreme Learning Machine
ABSTRACT Air quality has always been one of the most important environmental concerns for the general
public and society. Using machine learning algorithms for Air Quality Index (AQI) prediction is helpful
for the analysis of future air quality trends from a macro perspective. When conventionally using a single
machine learning model to predict air quality, it is challenging to achieve a good prediction outcome under
various AQI fluctuation trends. To address this problem, a genetic algorithm-based improved extreme
learning machine (GA-KELM) prediction method is proposed. First, a kernel method is introduced to
produce the kernel matrix, which replaces the output matrix of the hidden layer. To address the issue
of the conventional extreme learning machine, where the number of hidden nodes and the random
generation of thresholds and weights degrade the network's learning ability, a genetic algorithm is
then used to optimize the number of hidden nodes and layers of the kernel extreme learning machine.
The thresholds, the weights, and the root mean square error are used to define the fitness function. Finally, the
least squares method is applied to compute the output weights of the model. Genetic algorithms are able to
find the optimal solution in the search space and gradually improve the performance of the model through an
iterative optimization process. In order to verify the predictive ability of GA-KELM, based on the collected
basic data of long-term air quality forecast at a monitoring point in a city in China, the optimized kernel
extreme learning machine is applied to predict air quality (SO2, NO2, PM10, CO, O3, and PM2.5 concentrations
and AQI), with comparative experiments based on CMAQ (Community Multiscale Air Quality), SVM (Support
Vector Machines), and DBN-BP (Deep Belief Networks with Back-Propagation). The results show that the
proposed model trains faster and makes more accurate predictions.
INDEX TERMS Time series, air quality forecasting, machine learning, extreme learning machine, genetic
other illnesses [7], [8]. Therefore, studies on air quality prediction are particularly important
and are considered a key factor for environmental protection. In order to more comprehensively
assess the health effects of air pollution, numerous air quality monitoring stations have been set
up in major cities. Air quality predictions can be made based on the data collected from these
stations. Air quality monitoring, modeling, and accurate predictions are important for having a
clear understanding of future pollution levels and their associated health risks.

Recently, the inherent property of machine learning algorithms to automatically learn features at
multiple levels of abstraction has become increasingly important in providing solutions to this
challenging task [9], [10]. However, the model only forecasts PM10 and SO2 levels, and it is also
challenging to obtain measurement values needed to construct the dataset [11]. Wu Q. et al. proposed
an optimal-hybrid model for daily AQI prediction considering air pollutant factors, with the model’s
inputs being the six atmospheric pollutants. However, neural networks typically struggle with slow
learning, a tendency to fall into local minima, and a complex network training process. Based on the
generalized inverse matrix theory, Huang et al. proposed an extreme learning machine (ELM) algorithm
with a feedforward neural network that includes a single hidden layer, such that the problems of
conventional neural network algorithms are circumvented. The ELM algorithm used to predict the AQI
outperformed neural networks in terms of parameter selection, training speed, and prediction
accuracy [12]. However, the parameters of the hidden layer nodes and the number of nodes in the test
hidden layer are selected at random, which puts the prediction accuracy to a great test.

In order to solve the aforementioned problems, we propose to optimize the number of ELM hidden layer
nodes, thresholds, and weights, along with an improved genetic algorithm (GA) that uses root mean
square error (RMSE) as the fitness function, to obtain the optimal network structure for air quality
prediction [14]. The number of hidden layer nodes is updated by continuous coding discretization,
the input weights and hidden layer thresholds are updated by continuous coding, and the update
thresholds and weights are selected with the number of updated layers to form a hierarchical control
structure [15]. The proposed GA-based improved extreme learning machine (GA-KELM) algorithm is
applied to air quality prediction, and its performance is compared with that of community multiscale
air quality modeling system (CMAQ), support vector regression (SVR), and deep belief network-back
propagation (DBN-BP). The results show that the accuracy of the proposed GA-KELM algorithm is
reliable for air quality prediction [16].

In this study, an improved extreme learning machine model based on a genetic algorithm is designed
and applied to AQI prediction. To verify the effectiveness of the model, we conducted tests on three
real-world datasets. The results confirmed that the proposed method has superior performance and
outperforms some advanced methods currently in use. The main contributions of this paper are:
(1) modifying the ELM activation function or using the kernel function to improve the prediction
accuracy, (2) optimizing the ELM using GA to improve the stability of the results and further
enhance the prediction accuracy, and (3) obtaining the correlation analysis results of atmospheric
environmental quality prediction parameters by comprehensively considering each relevant factor in
line with the actual situation. The remainder of this paper is organized as follows. Section II
presents related work. Section III describes ELM and the proposed GA-KELM, and illustrates the
improvements using the model. Section IV discusses experimental results where GA-KELM is compared
with several other methods in terms of prediction results. The last section concludes the entire
work and presents directions for future research.

II. RELATED WORK
Air quality prediction has been extensively researched in the literature [17]. In recent years,
numerous researchers have made significant contributions to the field by leveraging quantitative
studies and the latest techniques to identify various air quality patterns and their underlying
trends [18]. Existing work in this area relies on statistical methods and shallow machine learning
models to address the problem of air quality prediction [19].

Agarwal and Sahu [20] conducted air quality prediction studies by employing statistical models.
Lary et al. [21] combined remote sensing and meteorological data with ground-based PM2.5
observations. Zheng et al. [22] proposed a hybrid prediction method that combines a linear
regression-based temporal prediction method with an ANN-based spatial prediction method for
pollutant concentrations. Zheng et al. [23] used a data-based approach for the next 48 hours of
PM2.5 prediction, implementing a prediction model based on linear regression and neural network.
They combined meteorological data, weather forecast data, and air quality data from monitoring
stations. Rajput and Sharma [24] used a multiple regression model to represent the changes in air
quality index (AQI), considering ambient temperature, relative humidity, and barometric pressure as
the main parameters in the regression model for AQI calculation [25]. These classical methods and
models all have the advantages of simple algorithms, easy processing, and acceptable prediction
results. However, obtaining precise and specific air quality prediction values remains
challenging [26].

Elbaz et al. [27] proposed a novel deep learning approach that extracts high-level abstractions to
capture the spatiotemporal characteristics of NEOM city in Saudi Arabia at hourly and daily
intervals. Campbell et al. [28] described the development of FV3GFSv16 coupled with the
“state-of-the-art” CMAQ model version 5.3.1. Jin et al. [29] proposed an interpretable variational
Bayesian deep learning model with self-filtering capability for PM2.5 prediction information, which
effectively improves prediction accuracy.
Zhou et al. [30], [31], [32] proposed a method based on an improved Grasshopper optimization
algorithm to classify the color difference of dyed fabrics using kernel extreme learning machine.
In that study, the classification of color differences in dyed fabric images is performed using the
kernel extreme learning machine, and the kernel function parameters are optimized by the improved
Grasshopper optimization algorithm to achieve color difference classification of dyed fabric images.
Xue et al. [33] proposed a GA-based air quality prediction model to optimize the parameters of the
weighted extreme learning machine (WELM). Despite the progress made by the aforementioned methods,
they also exhibit limitations; their training efficiency is relatively low, and deep learning
algorithms are not yet fully mature. These challenges present greater obstacles for the application
of deep learning, necessitating improvements to existing models, the development of new models, and
the enhancement of their predictive capabilities [34], [35].

The use of statistical or numerical forecasting techniques is subject to several limitations.
Neural networks are widely used because of their unique associative abilities, memory, and
distinctive learning [36], [37]. Given the highly nonlinear nature of AQI changes and the strong
generalization and nonlinear characterization abilities of neural networks, the kernel extreme
learning machine (KELM) neural network model is employed to investigate air quality prediction using
a real dataset. The weights and threshold values of KELM are optimized using a genetic optimization
algorithm [38].

In this section, AQI is first introduced, the ELM and KELM algorithms are presented next, and a new
GA-KELM learning method for AQI prediction is then proposed.

Air quality forecasting has been a key issue in early warning and control of urban air pollution.
Its goal is to anticipate changes in the AQI value at observation points over time. The observation
period, which is decided by the ground-based air-quality monitoring station, is usually set for one
hour.

Furthermore, a location’s air quality value is largely influenced by the weather conditions
prevailing at that location. Air quality monitoring stations measure air temperature, wind
direction, atmospheric pressure, relative humidity, wind speed and other meteorological parameters,
as well as air pollutant concentrations [39]. Air quality prediction is also challenging due to the
rapid changes in pollutant emission and weather conditions. Numerous variables, such as wind speed,
temperature, humidity, and pollutants themselves, are highly nonlinear, dynamic, and have inherent
interdependencies, making it more challenging to accurately predict air quality at a specific time
and place. Therefore, it is essential to figure out how to deal with these factors and exploit them
from multivariate time-series data related to air quality. A typical meteorological factor sequence
diagram is shown in FIGURE 1.

The AQI of each pollutant, or individual AQI (IAQI), is the largest value of the day. AQI is
calculated as in (1):

$$\mathrm{AQI} = \max\{\mathrm{IAQI}_1, \mathrm{IAQI}_2, \mathrm{IAQI}_3, \cdots, \mathrm{IAQI}_n\} \quad (1)$$

where $\mathrm{IAQI}_1, \mathrm{IAQI}_2, \mathrm{IAQI}_3, \cdots, \mathrm{IAQI}_n$ represent the
sub-index corresponding to each pollutant item. In this study, the calculation of AQI involves only
six pollutants. Therefore, (1) can be expressed as:

$$\mathrm{AQI} = \max\{\mathrm{IAQI}_{\mathrm{SO_2}}, \mathrm{IAQI}_{\mathrm{NO_2}}, \mathrm{IAQI}_{\mathrm{O_3}}, \ldots\} \quad (2)$$

To meet our research goals, this paper proposes GA-KELM, a method for air quality prediction based
on an improved extreme learning machine, which in turn is based on an improved GA [40]. Aiming at
the problem of network instability caused by the randomly generated input layer weights and hidden
layer thresholds of KELM, a GA is used to optimize the KELM weights and thresholds, thereby
improving the model’s performance in terms of prediction accuracy, which is the main objective of
this algorithm. In each iteration of the GA, a new offspring population is generated by selection,
crossover, and mutation, and the individual with good fitness value is selected. The GA stops
iterating when the stopping criteria are satisfied. The GA is used to determine the optimal weights
and threshold values, which overcomes the instability of KELM and reduces prediction errors, thus
resulting in a more reliable prediction model and improved air quality prediction accuracy. Details
of the model will be discussed in the following sections.

A. EXTREME LEARNING MACHINE
ELM was first proposed by Huang. It is characterized by its fast training and high training
accuracy. Feedforward neural networks are mainly based on the gradient descent method [41]. Their
main drawbacks are the slow training, the tendency to fall into a local minimum point that cannot
reach the global optimum, and the high sensitivity to the learning rate $\eta$ (if the selected rate
is improper, it might cause slow convergence and the training process will thus take a long time, or
it becomes unstable). FIGURE 2 shows the network structure of an ELM.

Consider $N$ different samples $(x_i, t_i)$, $i \in \{1, 2, \cdots, N\}$, where $x_i$ denotes the
input and $t_i$ represents the target, $L$ hidden layer neurons, and an activation function $g(x)$.
The mathematical expression for ELM output is:

$$y_i = \sum_{j=1}^{L} \beta_j \, g(w_j x_i + b_j) \quad (3)$$

where $j \in \{1, 2, \cdots, L\}$, $w_j$ are the weights between the input and hidden layer neurons,
$b_j$ are the thresholds of the hidden layer neurons, $\beta_j$ are the weights between the hidden
and output layer neurons, and $g(w_j x_i + b_j)$ is the output of the hidden layer neurons.
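The ELM output expression in (3) can be illustrated with a short NumPy sketch. It is not the authors' implementation: the sigmoid activation, the uniform initialization range, the synthetic data, and the helper names `elm_fit` and `elm_predict` are all assumptions made for illustration. The input weights and thresholds are drawn at random and kept fixed, and only the output weights are solved for by least squares.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_fit(X, T, n_hidden, rng):
    """Fit a single-hidden-layer ELM; returns (W, b, beta)."""
    n_features = X.shape[1]
    # Input weights w_j and thresholds b_j are random and never trained
    # (the uniform range [-1, 1] is an assumption).
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Hidden-layer output matrix: H[i, j] = g(w_j x_i + b_j).
    H = sigmoid(X @ W + b)
    # Output weights beta: least-squares solution of H @ beta = T.
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)
    return W, b, beta

def elm_predict(X, W, b, beta):
    # y_i = sum_j beta_j * g(w_j x_i + b_j), as in (3).
    return sigmoid(X @ W + b) @ beta

# Synthetic regression data (hypothetical, not the paper's dataset).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
T = np.sin(X.sum(axis=1))

W, b, beta = elm_fit(X, T, n_hidden=30, rng=rng)
pred = elm_predict(X, W, b, beta)
train_rmse = float(np.sqrt(np.mean((pred - T) ** 2)))
print(f"training RMSE: {train_rmse:.4f}")
```

Because `W` and `b` are random, repeated runs with different seeds give different fits, which is the instability that motivates optimizing these quantities with a GA.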
TABLE 2. Category range of air pollutants (technical regulation on ambient air quality index, HJ633-2012).
error of AQI and the average relative error of major pollutants is used as our fitness function.
Then the GA is used to optimize the fitness function, and the penalty factor and kernel parameter
are used as the optimization parameters. According to previous theory, the standard GA will complete
convergence at an early stage. The maximum evolutionary generation is set to 40, and the kernel
function selected in KELM is the Gaussian RBF. The fitness curve of GA optimization is shown in
Fig. 8. It can be observed that the fitness keeps decreasing and finally reaches convergence.
At this point, the optimal penalty factor and kernel function parameters can be obtained.
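The search over the penalty factor and the kernel parameter can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the KELM closed form `alpha = (K + I/C)^(-1) T` is the commonly used formulation, the fitness is plain validation RMSE, and the population size, search bounds, tournament selection, arithmetic crossover, mutation rate, and synthetic data are all hypothetical choices. Only the Gaussian RBF kernel and the limit of 40 generations come from the text.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Gaussian RBF kernel: exp(-gamma * ||a - b||^2).
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kelm_fit(X, T, C, gamma):
    # Assumed KELM closed form: alpha = (K + I/C)^(-1) T.
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + np.eye(len(X)) / C, T)

def kelm_predict(X_new, X_train, alpha, gamma):
    return rbf_kernel(X_new, X_train, gamma) @ alpha

def fitness(genes, Xtr, Ttr, Xva, Tva):
    # Genes hold log10(C) and log10(gamma); lower validation RMSE is better.
    C, gamma = 10.0 ** genes
    alpha = kelm_fit(Xtr, Ttr, C, gamma)
    err = kelm_predict(Xva, Xtr, alpha, gamma) - Tva
    return float(np.sqrt(np.mean(err ** 2)))

def ga_search(Xtr, Ttr, Xva, Tva, pop_size=20, generations=40, seed=0):
    rng = np.random.default_rng(seed)
    low, high = np.array([-2.0, -3.0]), np.array([4.0, 1.0])  # assumed bounds
    pop = rng.uniform(low, high, size=(pop_size, 2))
    history = []
    for _ in range(generations):
        scores = np.array([fitness(g, Xtr, Ttr, Xva, Tva) for g in pop])
        history.append(float(scores.min()))
        children = [pop[scores.argmin()].copy()]  # elitism: keep the best
        while len(children) < pop_size:
            # Tournament selection: best of two random individuals, twice.
            i = rng.integers(pop_size, size=2)
            j = rng.integers(pop_size, size=2)
            p1, p2 = pop[i[scores[i].argmin()]], pop[j[scores[j].argmin()]]
            w = rng.uniform()
            child = w * p1 + (1.0 - w) * p2  # arithmetic crossover
            if rng.uniform() < 0.2:          # Gaussian mutation
                child = child + rng.normal(0.0, 0.3, size=2)
            children.append(np.clip(child, low, high))
        pop = np.array(children)
    scores = np.array([fitness(g, Xtr, Ttr, Xva, Tva) for g in pop])
    history.append(float(scores.min()))
    return 10.0 ** pop[scores.argmin()], history

# Synthetic data (hypothetical): 80 training and 40 validation samples.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(120, 3))
T = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.normal(size=120)
(best_C, best_gamma), history = ga_search(X[:80], T[:80], X[80:], T[80:])
print(f"C={best_C:.3g}, gamma={best_gamma:.3g}, best RMSE={history[-1]:.4f}")
```

Because the best individual is copied unchanged into each new generation, the recorded best fitness in `history` never increases, which mirrors the decreasing fitness curve described above.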
TABLE 3. Comparison of RMSE of GA-KELM with other baseline approaches.
TABLE 4. Comparison of MSE of GA-KELM with other baseline approaches.
TABLE 5. Comparison of R² of GA-KELM with other baseline approaches.

to total predictions is estimated to measure the accuracy of the model in predicting the correct AQI
category.

The performance of the model in predicting air pollutant concentrations is measured based on the
RMSE, MSE and R² values of all predicted pollutants. The results are compared with baseline methods
for predicting air pollutant concentrations, such as CMAQ, SVR, and DBN-BP, with meteorological
parameters as auxiliary inputs. However, the factors considered in model evaluation are different.

The proposed and baseline models were initially trained using the training dataset, and from
Tables 3-5, we infer that GA-KELM has smaller RMSE and MSE values and higher R² values compared to
baseline methods for predicting concentrations of various air pollutants. The reduced RMSE and MSE
values demonstrate the reliability of the GA-KELM predictions, while the reduced RMSE and improved
R² values indicate the specificity of its mean predictions. The superior performance of GA-KELM
illustrates its efficiency in accurately capturing spatiotemporal relationships and their impact on
predicted values.
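For reference, the three evaluation metrics named above can be computed as in the following sketch. The helper name `regression_metrics` and the four sample values are hypothetical and are not taken from the paper's tables; the formulas are the standard definitions of MSE, RMSE, and the coefficient of determination.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE and R^2 for one predicted pollutant series."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    mse = float(np.mean(residual ** 2))
    rmse = float(np.sqrt(mse))
    ss_res = float(np.sum(residual ** 2))                    # unexplained variation
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))    # total variation
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "RMSE": rmse, "R2": r2}

# Made-up observed and predicted concentrations, for illustration only.
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 71.0, 79.0]
scores = regression_metrics(observed, predicted)
print(scores)
```

Lower MSE and RMSE and an R² closer to 1 indicate a better fit, which is the direction of comparison used for GA-KELM against the baselines.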
