
Air Quality Prediction

The document discusses using an LSTM algorithm and Arduino device to predict air quality in real-time. It collects air quality data from sensors, feeds it to the LSTM model for training, and displays predictions on an LCD screen. The system can accurately predict AQI values and be applied to environmental monitoring, public health, and urban planning. It also reviews several related studies on air quality prediction using machine learning techniques.

Air Quality Prediction Using Variant LSTM Algorithm And Arduino


Abstract: Air pollution has become one of the most important factors affecting survival in today's world. Asthma, cancer, and heart attacks are some of the diseases linked to breathing polluted air, which is why air quality monitoring and prediction have become critical. Many scientists have studied the problem but have yet to draw firm conclusions. Among the many machine learning algorithms available, we use the LSTM algorithm, which is well known for handling time-series data and long-term dependencies. A large number of earlier studies have used LSTM, but they considered only a few pollutants for their predictions, which leaves room to extend their work. To address this, we feed all eight pollutants, together with temperature, to the LSTM algorithm as time-series data collected from datasets available on Kaggle. This project proposes a method for predicting air quality using a Long Short-Term Memory (LSTM) algorithm and an Arduino microcontroller. The system collects real-time air quality data from various sensors, processes it using the LSTM algorithm, and predicts the air quality index (AQI) for a given location and time. The LSTM model is trained on historical data from the same location and evaluated using various performance metrics. The predicted AQI values are displayed on an LCD screen connected to the Arduino board. This project offers an efficient and cost-effective solution for real-time air quality monitoring that can be applied to a range of use cases, including environmental monitoring, public health, and urban planning. Results show that the LSTM algorithm can predict AQI values with high accuracy, achieving a mean absolute error (MAE) of less than 10.

Index Terms- Air quality prediction, LSTM algorithm, Arduino microcontroller, real-time monitoring, AQI.

I. INTRODUCTION

Air quality is a critical aspect of our environment, and it has a direct impact on our health and well-being. As the world becomes increasingly industrialized, air pollution levels are rising, and it is becoming more important to monitor and predict air quality in real time. In this regard, the Air Quality Prediction Using LSTM Algorithm and Arduino system is designed to tackle this problem.

The system uses a combination of an LSTM (Long Short-Term Memory) algorithm and an Arduino device to predict air quality in real time. The LSTM algorithm is a deep learning model that is specifically designed to handle time-series data, such as air quality measurements. The Arduino device is equipped with sensors that measure different air quality parameters, such as temperature, humidity, and pollutants. The data collected by the sensors is then fed into the LSTM algorithm, which uses it to make predictions about the current and future air quality.

The Air Quality Prediction Using LSTM Algorithm and Arduino system is versatile and can be used in a variety of applications, including air quality monitoring, pollution control, and public health and safety. By providing real-time air quality predictions, this system can help individuals and organizations make informed decisions about air quality and take action to improve it.

II. LITERATURE REVIEW

1) In Hangzhou, Haiping Lin, Zhongjie Fu, Jiana Yao, and Bingqiang Huang conducted research on air quality prediction using a machine learning model. Their study employs a Bayesian network model to forecast Hangzhou's air quality. The network is built using several pollutants, such as SO2, O3, NO, and the PM variants, as the model's evaluation criteria, and the model's output is the AQI value. The air quality index is the measure used to objectively describe the state of the air: the risk to human health increases as the index category increases. Pollutants such as PM are among the variables that affect the ambient air quality index.

2) "Air Quality Prediction and Analysing in india by using the Machine learning", contributed by Mrs. J. Gnana Jeslin, Mrs. A. Gnana Soundari, and Akshaya A.C. The authors employed machine learning and forecasting to evaluate the air quality index at a specific location in India, where the air quality index is a frequently used indicator of pollution levels over time. Using data from previous years and projecting over a number of coming years, they framed the task as a regression problem and constructed a model to predict the air quality index. For the prediction problem, the cost of estimation helps to increase the model's efficiency. The model has a reported 96% accuracy rate when predicting the air quality index for all of India. Additionally, AHP MCDM is a technique that can be applied to determine the order of preference based on how closely each alternative matches the ideal solution.

3) Limei Ma, Yijun Gao, and Chen Zhao conducted a series of tests in "Air quality prediction using SPSS and Machine Learning." Their work focuses on accurate prediction, whereas our study also aims to support prevention. Air quality is a phenomenon with several contributing factors, so predicting the dependent variable from several independent variables is more practical and effective than estimating it from a single independent variable. Multivariate linear regression is concerned with regression on two or more independent variables.

4) "A Review over an Air Quality Prediction by Using various Machine Learning Techniques" by R. K. Gupta et al. (2021). This article offers an updated overview of the most recent machine learning methods for forecasting air quality. The authors discuss the benefits and drawbacks of several algorithms, such as support vector machines, random forests, and deep learning. Additionally, they emphasize how crucial good data quality and pre-processing are to raising the accuracy of the prediction model.
5) "Air Quality Prediction Models: From Conventional Techniques to Machine Learning and Deep Learning" by H. A. Almalki et al. (2021). This review article covers the most recent air quality prediction models, using both conventional statistical methods and machine learning and deep learning algorithms. The authors explore the difficulties involved in forecasting air quality in complex metropolitan areas and highlight the possibility of merging various models and data sources to create more precise and reliable predictive models.

6) "Air Quality Prediction with Artificial Intelligence Techniques" by A. M. Mohamed et al. (2021). The application of artificial intelligence methods for predicting air quality, such as machine learning and deep learning, is the main topic of this review article. The authors explore the benefits and drawbacks of various algorithms and emphasize how crucial it is to choose the right input variables and optimize hyperparameters in order to create precise predictive models.

7) "A paper on Recent Advances in Air Quality Prediction by using various Machine Learning Techniques" by S. Mehta et al. (2022). This paper summarizes the most recent developments in machine learning methods for predicting air quality, such as deep learning, support vector regression, and random forest regression. The authors address the possibility of adopting explainable AI to improve the interpretability of predictive models and emphasize the value of incorporating data from many sources, such as satellite data and social media.

8) "Air Quality Prediction: From the Mathematical Models to Machine Learning and Deep Learning Techniques" by S. Jafari et al. (2021). In addition to statistical models, this work discusses the machine learning and deep learning methods used to predict air quality. The authors stress the potential of combining ensemble models and transfer learning to improve the performance of predictive models, as well as the necessity of data pretreatment and feature selection for creating predictive models with improved accuracy.

9) "Air Quality Prediction Techniques and Their Applications: From Traditional Methods to Deep Learning" by S. M. M. Islam et al. (2021). This study covers the most recent methods for predicting air quality, including many machine learning and deep learning algorithms as well as conventional statistical methods. The authors address the possibilities for employing interpretability techniques to increase the transparency of predictive models and emphasize the significance of using accurate data sources and incorporating domain knowledge into predictive models.
III. METHODOLOGY

Figure 1: Device design

Figure 1 shows the collection of air quality data with multiple parameters from the desired locations. Sensors such as PM10, PM2.5, carbon monoxide, wind vane and wind speed, and temperature sensors, among others, send their readings to the Arduino microcontroller, which forwards the data to a local server such as a PC or laptop. The data must be pre-processed before it is sent to the training model. Pre-processing includes cleaning, normalization, and formatting the file.

Cleaning the data
Missing values are common in any dataset, typically because the data is collected from different sources. To overcome this problem, rows with missing data are eliminated and the remaining values are converted into numerical form so that they are easy to process.

Normalization
In this step we scale the data into a specific range. The AQI is a number used to communicate to the public and to the government how polluted the air is in a given location.

Formatting the file
The file is converted from one format to another, here from XLSX to CSV, so that it can be uploaded for training.

Training the Model
Train the LSTM model on the training data by updating the model's weights and biases to reduce the error between the predicted values and the actual values. This can be done using an optimization algorithm such as stochastic gradient descent.

Model evaluation
Evaluate the performance of the model on test data to determine how well it generalizes to new, unseen data. Metrics such as mean squared error can be used to measure the model's accuracy.

Model deployment
Finally, deploy the trained model on the Arduino platform, where it can use the data from the sensors to make predictions in real time.

IV. ALGORITHM

LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) architecture designed to handle long-term dependencies in input sequences. LSTM models are useful for processing and predicting sequential data such as speech, text, and video.

At a high level, an LSTM consists of three main components: an input gate, a forget gate, and an output gate. Each of these gates is implemented using a sigmoid activation function and controls the flow of information through the LSTM.

The input gate determines which parts of the input sequence should be used to update the LSTM's internal state. The forget gate decides which parts of the internal state should be forgotten. The output gate determines which parts of the internal state should be used to produce the output sequence.

With these gates, the LSTM cell is capable of carrying information across time. The cell state is updated using a combination of the input gate and the forget gate, and can be thought of as the "memory" of the LSTM.

The equations that govern the behavior of an LSTM are as follows:

    i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)
    f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)
    o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)
    \tilde{C}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)
    C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
    h_t = o_t \odot \tanh(C_t)

Here, i_t, f_t, and o_t are the outputs of the input, forget, and output gates, \tilde{C}_t is the candidate cell state, C_t is the cell state, and \odot denotes element-wise multiplication. The subscript t represents the current time step, and h_{t-1} represents the output of the previous time step. W_i, W_f, W_o, and W_c are the weight matrices for the input, forget, output, and candidate gates respectively; U_i, U_f, U_o, and U_c are the weight matrices for the recurrent connections; and b_i, b_f, b_o, and b_c are bias terms.
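The gate equations above can be traced step by step in code. The following is a minimal sketch of a single LSTM time step in plain Python, not the implementation used in this project. The function names (`lstm_step`, `affine`, `init_params`), the parameter dictionary layout, the sizes (8 input features to mirror the eight pollutants, 4 hidden units), and the random untrained weights are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, U, b, x, h):
    """Compute W x + U h + b, returning one value per hidden unit."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; p maps 'W_i', 'U_i', 'b_i', ... to parameters."""
    i_t = [sigmoid(z) for z in affine(p["W_i"], p["U_i"], p["b_i"], x_t, h_prev)]
    f_t = [sigmoid(z) for z in affine(p["W_f"], p["U_f"], p["b_f"], x_t, h_prev)]
    o_t = [sigmoid(z) for z in affine(p["W_o"], p["U_o"], p["b_o"], x_t, h_prev)]
    c_tilde = [math.tanh(z) for z in affine(p["W_c"], p["U_c"], p["b_c"], x_t, h_prev)]
    # New cell state: keep part of the old memory, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    # Hidden state: the output gate filters the squashed cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases (illustrative, untrained)."""
    rng = random.Random(seed)
    p = {}
    for gate in "ifoc":
        p[f"W_{gate}"] = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                          for _ in range(n_hidden)]
        p[f"U_{gate}"] = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
                          for _ in range(n_hidden)]
        p[f"b_{gate}"] = [0.0] * n_hidden
    return p

N_FEATURES, N_HIDDEN = 8, 4  # assumed sizes, for illustration only
params = init_params(N_FEATURES, N_HIDDEN)

# Synthetic sequence of 5 time steps standing in for normalized sensor readings.
data_rng = random.Random(1)
sequence = [[data_rng.random() for _ in range(N_FEATURES)] for _ in range(5)]

h, c = [0.0] * N_HIDDEN, [0.0] * N_HIDDEN
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, params)

print("final hidden state:", h)
```

Carrying `h` and `c` from one call to the next is what lets the cell pass information across time steps. A trained model would replace the random weights with learned ones and add an output layer that maps `h` to an AQI value.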
During training, the weights and biases of an LSTM are updated using backpropagation through time (BPTT), which involves calculating the gradients of the error with respect to the parameters of the LSTM.

Overall, LSTMs are a powerful tool for processing and predicting time-series data, thanks to their capacity to capture long-term dependencies and remember important information across multiple time steps.

V. MATHEMATICAL MODEL

Mathematical model of an LSTM algorithm

Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture that can be represented mathematically.

At a high level, it consists of three gates: the input gate, the forget gate, and the output gate. Each gate is a sigmoid function that takes the previous hidden state (h_{t-1}) and the current input (x_t) as input, and outputs a value between 0 and 1. These gate values manage the flow of information through the cell.

The input gate determines which information must be stored in the cell state (C_t). It is represented mathematically by:

    i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)

The forget gate determines which information should be discarded from the cell state. It is represented mathematically as follows:

    f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)

The output gate determines which information leaves the cell. It is represented mathematically as follows:

    o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)

The current input (x_t) is combined with the previous hidden state (h_{t-1}) to produce a candidate cell state (\tilde{C}_t), which is then used together with the input gate and the forget gate to update the cell state. The updated cell state (C_t) is then passed through the output gate to produce the current hidden state (h_t), which is the output of the LSTM cell. These operations are represented mathematically as follows:

    \tilde{C}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)
    C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
    h_t = o_t \odot \tanh(C_t)

where W_i, W_f, W_o, and W_c are weight matrices and b_i, b_f, b_o, and b_c are bias vectors, all of which are learned during the training process, and \odot denotes element-wise multiplication.

The input gate (i_t) controls how much of the new input (x_t) should be added to the cell state (C_t) at each time step. It is defined as:

    i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)

where W_i is a weight matrix, b_i is a bias vector, and the [h_{t-1}, x_t] notation indicates the concatenation of the previous hidden state (h_{t-1}) and the current input (x_t).

The forget gate (f_t) controls how much of the previous cell state (C_{t-1}) must be retained at each time step. It is defined as:

    f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)

where W_f is a weight matrix, b_f is a bias vector, and [h_{t-1}, x_t] again indicates the concatenation of the previous hidden state and the current input.

The output gate (o_t) controls how much of the current cell state (C_t) should be output at each time step. It is defined as:

    o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)

where W_o is a weight matrix, b_o is a bias vector, and [h_{t-1}, x_t] indicates the concatenation of the previous hidden state and the current input.

The cell state (C_t) is updated using the input and forget gates and the candidate cell state (\tilde{C}_t), which is calculated as:

    \tilde{C}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)

where W_c is a weight matrix, b_c is a bias vector, and [h_{t-1}, x_t] indicates the concatenation of the previous hidden state and the current input. The updated cell state (C_t) is computed as:

    C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t

The hidden state (h_t) is computed as:

    h_t = o_t \odot \tanh(C_t)

This output is used for further processing or as the output for the current time step.

The LSTM algorithm thus uses a set of mathematical equations to determine which information to store, forget, and output at each time step, depending on the previous hidden state and the current input data. This allows it to capture long-term dependencies and to selectively remember or forget information over time, making it well suited to tasks such as language modelling and time-series analysis.

VI. RESULTS AND DISCUSSIONS

We plotted the actual values (first plot) and the predicted values (second plot). Visually, the two distributions are almost the same, which suggests that the predictions closely track the actual values.

Fig 6.1 Actual value graph
Fig 6.2 Predicted value graph

Fig 6.3 Accuracy of the model

Computed graphs of model accuracy with different parameters

Computed model accuracy for the parameter SO
Fig 6.4 Accuracy of SO

Computed model accuracy for the parameter NO2
Fig 6.5 Accuracy of NO2

Computed model accuracy for CO
Fig 6.6 Accuracy of CO

Computed model accuracy for O3
Fig 6.7 Accuracy of O3
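The visual comparison of the actual and predicted plots can be backed by numeric error metrics such as MAE, MSE, RMSE, and MAPE. The sketch below shows one way to compute them in plain Python. The function name `regression_metrics` and the sample AQI values are hypothetical and are not results from this study.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE, and MAPE (in percent) for two equal-length lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides by the actual value, so every actual value must be non-zero.
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}

# Hypothetical AQI values, for illustration only.
actual_aqi = [100.0, 150.0, 200.0, 250.0]
predicted_aqi = [110.0, 140.0, 190.0, 260.0]

metrics = regression_metrics(actual_aqi, predicted_aqi)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Lower values mean the predictions are closer to the measurements. MAE and RMSE are in AQI units, while MAPE is a percentage and is therefore easier to compare across pollutants measured on different scales.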
VII. COMPARISON TABLE OF DIFFERENT MODELS

Here we compare metrics such as MSE, RMSE, and MAPE, among others, across different LSTM variants to determine which algorithm performs best. We observed that the Stacked LSTM model gives better accuracy than the other models. However, the performance of a model depends on the task being performed and the datasets used with that model.

Tab 7.1 Comparison table of various models

VIII. CONCLUSION

In conclusion, the use of machine learning and Arduino in air quality prediction has shown promising results in providing accurate, real-time measurements of air quality. Machine learning algorithms such as linear regression, decision trees, and neural networks have been used to estimate air quality parameters such as PM2.5, CO, and NO2 with high accuracy.

By integrating machine learning algorithms with an Arduino-based sensor network, air quality monitoring can be done in real time, providing continuous updates on the air quality status. This can help both individuals and governments take the necessary actions to reduce pollution levels and promote public health.

Furthermore, the use of low-cost Arduino-based sensor networks can provide a cost-effective solution for air quality monitoring, especially in developing countries where air pollution is a significant issue and expensive monitoring equipment may not be available.

Overall, the integration of machine learning and Arduino in air quality prediction is a promising approach that can help address the challenges associated with air pollution and provide a better understanding of the impact of air pollution on human health.

X. REFERENCES

[1] "Air pollution: how it affects our health," European Environment Agency, 03-Dec-2019. [Online]. Available: https://www.eea.europa.eu/themes/air/health-impacts-of-air-pollution

[2] Mrs. A. Gnana Soundari, Mrs. J. Gnana Jeslin, Akshaya A.C, "Indian Air Quality Prediction and the Analysis Using by the Machine Learning"

[3] Limei Ma, Yijun Gao, Chen Zhao, "The Research being Made on The Air Quality Index which is based on SPSS"

[4] "A Review on the Air Quality Prediction by Using the Machine Learning Techniques: From Traditional Methods to Deep Learning" by X. Wang et al. (2021)

[5] hu Wang, Yuhuang Hu, Javier Burgués, Santiago Marco and Shih-Chii Liu, Prediction of Gas Concentration Using Gated Recurrent Neural Networks, 2020, IEEE

[6] Haotian Jing & Yingchun Wang, Research on The Urban Air Quality Prediction Based on the Ensemble Learning of XGBoost, 2020, E3S Web of Conferences

[7] Venkat Rao Pasupuleti, Uhasri, Pavan Kalyan, Srikanth and Hari Kiran Reddy, Air Quality Prediction Of Data Log By Machine Learning, 2020, IEEE

[8] "A Review of Air Quality Prediction Models: From Conventional Techniques to Machine Learning and Deep Learning" by H. A. Almalki et al. (2021)

[9] "A Review on Air Quality Prediction Techniques and Their Applications: From Traditional Methods to Deep Learning" by S. M. M. Islam et al. (2021)

[10] "A Review of Air Quality Prediction with Artificial Intelligence Techniques" by A. M. Mohamed et al. (2021)
