IoT Based Automated Weather Report Generation and Prediction Using Machine Learning
Abstract— Weather can have a great impact on lives. Weather changes influence a wide range of human activities and affect agriculture and transportation. The main aim of this paper is to monitor and report weather conditions so that people are informed beforehand and can take the necessary actions to reduce the damage caused by a calamity by forecasting it. Various sensors are used to collect the data: previous data is used to train the system, and the prediction is made from the currently collected data. Temperature, pollutants, humidity and pressure are analyzed to predict the weather. Existing models are expensive in contrast to ours; being cheaper, the proposed model makes local-area monitoring feasible.
Keywords—Internet of things, Regression, machine learning, data collection through sensors
I. INTRODUCTION
Day-to-day atmospheric variations are described by the term weather. Day-to-day changes in temperature, humidity, cloud cover and precipitation can be captured by weather monitoring. When weather data is collected over a period of time, the resulting statistics can be used to categorize the climate of that locality. The assessment can be done on a weekly, monthly or even daily basis. Climate can therefore be described as a long-term average of weather. Weather change plays a significant role in many sectors, such as agriculture and food security.
Weather prediction is becoming more challenging in the modern world. Due to climate shift, natural disasters like cyclones, storms, earthquakes and tsunamis occur almost on a daily basis and cause destruction on a large scale. In these adverse times, correct rainfall prediction and weather monitoring are very important and useful. For countries that depend on agriculture, like India, correct rainfall prediction can help produce a bountiful harvest [1].
In the literature, there are many prediction models for rainfall and weather forecasting. For yearly forecasting or for large-scale prediction, these models do not provide accurate results due to vigorous climate change [2]. Forecasting is a very important research topic, as it requires science and technology to go hand in hand for correct results. Even though some very accurate prediction systems exist, they give no estimate of the future damage that calamities could cause.
The damage can still be estimated and predicted (in numerical terms) by collecting the dataset from Meteorological Departments (data center) for a specific location and conducting regression analysis on the dataset [3].
Hence, a better and more accurate weather monitoring and prediction system is needed, one which could give estimates of the damage caused by natural calamities and could also give a heads-up before any natural disaster is about to occur [4].
II. PROPOSED METHODOLOGY
The model will sense different weather parameters such as pressure, humidity, rainfall, temperature, dust particles and light, and will display the current day's temperature on a web portal. The model will also be able to predict the maximum and minimum temperatures of the next day using the past three days' data, as shown in figure 1.
It has a huge scope in today's world, as current models are expensive and we are providing a cheaper alternative. Also, existing predictions target a large scale, whereas our model makes local-area monitoring possible.
Although a few weather reporting models already exist, in today's world the concentration of various components of air and its quality are also gaining importance, in addition to weather [5].
To our knowledge, there is to date no model which can forecast weather, determine the various components in air and check or predict air quality. Existing weather forecasting models involve many mathematical calculations, but they are not designed appropriately to achieve a higher classification rate, and they cannot give the exact local-area weather, because covering local areas from a central area and then analyzing the weather makes the system inaccurate by up to 18%. Previous weather forecasting models used a complicated blend of mathematical instruments, which was insufficient to achieve a higher classification rate, and covering local areas was not feasible. Using different components to collect and assess data would be expensive [6].
A. Architecture of Proposed Model
Previous models were studied to gather information about the components they used, and better and cheaper alternatives were researched. All the information about the components needed to develop the desired model was collected (in figure 1).
Authorized licensed use limited to: University of Exeter. Downloaded on May 06,2020 at 03:58:51 UTC from IEEE Xplore. Restrictions apply.
Post that, we analyzed the techniques previously used in attempts to develop a similar model and came up with the best approach. A platform to store all the generated data was required. A portal with a decent UI was needed to retrieve the data from the platform and display it. Alongside, a large database was required to train the model [7].
B. Proposed System
Once the need was identified, a model with 8 sensors and a display was devised. Machine learning (Linear Regression) using MATLAB analysis was considered a desirable approach. Wixsite was considered for developing the portal, as it offers thousands of attractive templates. A dataset covering the past two years was collected and cleaned to retain only relevant data (in figure 2).
C. System Requirements
The system/model should be able to sense the current temperature, light intensity, pressure and amount of gases and display them on the LCD display. Alongside, it should also send the data to the cloud, from where it can be retrieved and displayed on a web portal. Not only this, the model should also be able to predict the next day's temperature [8].
Sensors are required to perform the sensing function, and the sensed data can be shown on the display attached to the Arduino board. An Internet connection is required to send the data to the cloud (Thingspeak), from which it can be retrieved using APIs [9].
D. Software Requirements
• Arduino IDE is used to write the code for the Arduino Mega (in figure 3).
E. Hardware Requirements
• Wi-Fi module (ESP 8266): ESP8266 is a chip that provides Wi-Fi connectivity, and it is used to upload data to the cloud (in figure 4).
Fig. 4. Wi-Fi Module
• Arduino Mega: Arduino Mega 2560 is a microcontroller board based on the ATmega2560 (in figure 5). It is used to connect the sensors, obtain the analog input from the different sensors and upload it to the cloud using the Wi-Fi module [11-14].
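The upload path described under System Requirements (sensor readings sent to Thingspeak and later retrieved through its APIs) can be sketched as follows. This is a minimal illustration in Python rather than the Arduino code used in the project, which the paper does not list. It only builds the request URL for the ThingSpeak update endpoint. The write key placeholder, the helper name `build_update_url` and the assignment of temperature, humidity and pressure to fields 1 to 3 are assumptions made for this example.

```python
from urllib.parse import urlencode

# Public ThingSpeak endpoint for writing one entry to a channel.
THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"


def build_update_url(write_api_key, readings):
    """Return the update URL for one set of sensor readings.

    `readings` maps a channel field number (1 to 8) to a sensor value.
    Which sensor goes into which field is an assumption of this sketch.
    """
    params = {"api_key": write_api_key}
    for field_number in sorted(readings):
        if not 1 <= field_number <= 8:
            raise ValueError("a ThingSpeak channel has fields 1 to 8")
        params[f"field{field_number}"] = readings[field_number]
    return f"{THINGSPEAK_UPDATE_URL}?{urlencode(params)}"


# Hypothetical readings: temperature (deg C), humidity (%), pressure (hPa).
example_url = build_update_url(
    "YOUR_WRITE_API_KEY", {1: 31.5, 2: 48, 3: 1008.2}
)
print(example_url)
```

Issuing an HTTP GET on such a URL, from the Wi-Fi module or from any HTTP client, stores one entry in the channel. The sketch does not perform the request, so it can be run without a network connection or a real key.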
LED emitter to be scattered towards the photodetector. The more particles there are, the more light is scattered and the higher the output voltage (in figure 9).
Fig. 12. MQ7 Sensor
• DHT11: This sensor measures temperature and humidity. It is low-cost and reliable (in figure 13).
Fig. 13. DHT11 Sensor
III. SYSTEM DESIGN
The SDLC model used here is the Iterative SDLC model (in figure 14).
Fig. 14. Iterative SDLC
A. User Interface Design
The interface has been designed on Wixsite with a little help from CSS and JavaScript. It has a soothing background, with temperatures displayed and website information posted on it. It also contains information about the various sensors used in the model [17].
B. Preliminary Product Description
This model collects data from various sensors such as MQ135, MQ7, DHT11 and BMP180, which give readings of different parameters in the surroundings such as temperature, humidity, pressure and air quality. This data is then sent to the Thingspeak cloud using the ESP 8266 Wi-Fi module. The data is stored in designated channels and is then used to calculate the dew point at that time. After the dew point calculation, machine learning is used to predict the temperature of the following day.
A. Dataset Collection
The dataset was collected from Kaggle; it was originally extracted from Weather Underground, which currently does not provide a free service. The dataset covers the period from 1st May 2016 to 11th March 2018 and gives the following parameters:
• Meantemp
• Meandewpoint
• Meanpressure
• Maxhumidity
• Minhumidity
• Maxtemp
• Mintemp
• Maxdewpoint
• Mindewpoint
• Maxpressure
• Minpressure
B. Feature Selection
The number of parameters was increased to take into account the weather conditions of the past three days. The data was then analyzed, and the following relevant parameters were identified.
For max temperature prediction:
• Maxtempm_1
• Maxpressure_1
• Mintempm_3
• Maxpressure_3
• Meanpressurem_3
For min temperature prediction:
• Mintempm_1
• Meantempm_3
• Maxtempm_1
• Maxtempm_2
• Maxdewptm_1
• Maxdewptm_3
• Meandewptm_1
• Meandewptm_2
• Meanpressure_1
For mean temperature prediction:
• Meantemp_3
• Meanpressure_3
• Maxtemp_1
• Maxtemp_2
• Mintemp_1
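The feature construction and selection described in this subsection can be illustrated with a short sketch. It assumes that a suffix such as `_1` or `_3` denotes the value of a parameter one or three days earlier, and it keeps a lagged column only when the absolute value of its Pearson correlation with the target is at least 0.6, the cut-off reported in the result analysis. The p-value test mentioned there is omitted. The function names and the ten-day toy record are hypothetical, and the paper's own analysis was carried out in MATLAB, not Python.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def lagged_columns(daily, names, max_lag=3):
    """Build columns such as 'maxtemp_1' (value one day earlier).

    Rows start at index `max_lag` so that every lag is available.
    """
    columns = {}
    for name in names:
        for lag in range(1, max_lag + 1):
            columns[f"{name}_{lag}"] = [
                daily[i - lag][name] for i in range(max_lag, len(daily))
            ]
    return columns


def select_features(daily, target, names, threshold=0.6, max_lag=3):
    """Keep lagged columns whose |r| with the target reaches the threshold."""
    target_values = [row[target] for row in daily[max_lag:]]
    selected = {}
    for column, values in lagged_columns(daily, names, max_lag).items():
        r = pearson(values, target_values)
        if abs(r) >= threshold:
            selected[column] = r
    return selected


# Hypothetical ten-day record: a steadily rising temperature and an
# alternating value that is unrelated to it.
daily = [{"maxtemp": 20 + day, "noise": (-1) ** day} for day in range(10)]
selected = select_features(daily, "maxtemp", ["maxtemp", "noise"])
print(sorted(selected))
```

On the toy record only the lagged temperature columns pass the cut-off, while the lags of the unrelated column are dropped.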
E. Technology Used
a) Internet of Things: To collect data.
b) Machine learning (Linear Regression): To train the model on the data.
c) Data Analytics: To analyze the predicted model.
d) Cloud Computing: To upload data to Thingspeak.
A. Testing
The model was tested multiple times. A few times it was left on the hostel terrace to sense the values, and the same was done at multiple other locations. At first, the system showed the temperature with an error of 1 degree Celsius, but this was later rectified. The same operation was performed in hostel rooms to check for any discrepancy, and the result was as expected.
B. Maintenance
The model has been fitted in a box to hold all the components together and keep them integrated rather than drifting apart (in figure 15, 16). Loose connections can be rectified easily. Only a continuous power supply and an internet connection are required to run the model for a long time.
C. Result Analysis
The parameter values collected from the DHT11 sensor are stored on the internet using the cloud and are used for further analysis in MATLAB. The data collected through the sensors can be viewed on a web portal (in figure 17, 18).
Current Temperature Accuracy is 99.05%
Parameters                               Next day Max    Next day Min    Next day Mean
                                         Temperature     Temperature     Temperature
                                         Accuracy        Accuracy        Accuracy
Explained variance                       0.94            0.93            0.95
Mean absolute error (degree Celsius)     1.35            1.29            1.10
Median absolute error (degree Celsius)   1.09            0.97            0.90
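The three measures reported in the result analysis (explained variance, mean absolute error and median absolute error) can be computed as in the following sketch. It is deliberately simplified: it fits a least-squares line with a single predictor, yesterday's maximum temperature, whereas the paper's regression uses several lagged parameters and was run in MATLAB. All names and temperature values below are hypothetical and do not reproduce the paper's results.

```python
from statistics import mean, median


def fit_line(xs, ys):
    """Least-squares slope and intercept for one predictor."""
    mean_x, mean_y = mean(xs), mean(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def population_variance(values):
    """Mean squared deviation from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def explained_variance(actual, predicted):
    """1 - Var(actual - predicted) / Var(actual)."""
    spread = population_variance(actual)
    if spread == 0:
        raise ValueError("actual values are constant")
    residuals = [a - p for a, p in zip(actual, predicted)]
    return 1 - population_variance(residuals) / spread


def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def median_absolute_error(actual, predicted):
    return median(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical maximum temperatures (deg C): previous day vs. same day.
previous_day_max = [30.1, 31.4, 29.8, 32.0, 33.2, 31.0]
same_day_max = [30.9, 31.0, 30.5, 32.6, 32.8, 31.7]

slope, intercept = fit_line(previous_day_max, same_day_max)
predicted = [slope * x + intercept for x in previous_day_max]

print("explained variance:", explained_variance(same_day_max, predicted))
print("mean absolute error:", mean_absolute_error(same_day_max, predicted))
print("median absolute error:", median_absolute_error(same_day_max, predicted))
```

An explained variance close to 1 means the residuals vary little compared with the temperatures themselves. The two absolute errors are in degrees Celsius, and the median is less sensitive to a few badly predicted days than the mean.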
Fig. 17. Website Screenshot displaying current weather
Fig. 18. Website Screenshot of weather prediction
The current day's temperature and the values from the different sensors are displayed on the web portal. For prediction of the following day's temperature, the parameters were sorted using correlation functions and p-values; that is, only the fields of the existing dataset that were relevant to the project were picked. These values were then used to train the model, after which it was able to predict future values of temperature. The correlation analysis is carried out on three parameters: Temperature, Humidity and Pressure (in Table I).
The correlation between temperature and humidity lies in the ranges 0.6 to 1 and -0.6 to -1. Hence the correlation between those two parameters is good, whereas the pressure parameter does not noticeably affect the value of temperature. The correlation range chosen for selecting parameters is 0.6 to 1 and -0.6 to -1, and the p-value used to segregate the parameters is 0.05. Humidity and pressure are only weakly correlated, at around 0.2. Hence only the temperature and humidity parameters are considered further.
The standard error indicates that the actual value lies within roughly plus or minus the error value of the predicted value.
Current Temperature Accuracy: The temperature shown contains an error of 0.5 degree Celsius.
D. Conclusion
The main aim is the integration of various sensors to collect data from which temperature can be predicted. Here, more sensors are used to assess more parameters for high accuracy. Machine learning algorithms are used to predict the temperature of the coming day from historical data, aiming at higher accuracy at lower cost. Also, to our knowledge it will be the first setup that monitors air quality along with forecasting weather and presents the results in real time.
REFERENCES
[1] M. Zhang, “Application of Data Mining Technology in Digital Library”, Journal of Computers, vol. 6, no. 4, April 2011, pp. 761-768.
[2] Z. Danping and D. Jin, “The Data Mining of the Human Resources Data Warehouse in University Based on Association Rule”, Journal of Computers, vol. 6, no. 1, January 2011, pp. 139-146.
[3] L. M. Saini and M. K. Soni, “Artificial neural network-based peak load forecasting using conjugate gradient methods”, IEEE Transactions on Power Systems, vol. 12, no. 3, pp. 907–912, 2002.
[4] S. Fan, C. X. Mao, and L. N. Chen, “Peak load forecasting using the self-organizing map”, in Advances in Neural Network-ISNN 2005. New York: Springer-Verlag, 2005, pt. III, pp. 640–649.
[5] N. Kourentzes, “Intermittent demand forecasts with neural networks”, International Journal of Production Economics, vol. 143, no. 1, pp. 198-206, 2013.
[6] Elia G. P., “A Decision Tree for Weather Prediction”, Universitatea Petrol-Gaze din Ploiesti, Bd. Bucuresti 39, Ploiesti, Catedra de Informatică, vol. LXI, no. 1, 2009.
[7] A. R. Finamore, V. Calderaro, V. Galdi, A. Piccolo, G. Conio, S. Grasso, “A day-ahead wind speed forecasting using data-mining model – a feed-forward NN algorithm”, IEEE International Conference on Renewable Energy Research and Applications, 2015, pp. 1230-1235.
[8] E. Erdem, J. Shi, “ARMA based approaches for forecasting the tuple of wind speed and direction”, Applied Energy 88, ELSEVIER, 2011, pp. 1405–1414.
[9] IOP Conf. Series: Journal of Physics: Conf. Series 910 (2017) 012020, doi:10.1088/1742-6596/910/1/012020.
[10] N. Chen, Z. Qian, I. T. Nabney, and X. Meng, “Wind Power Forecasts Using Gaussian Processes and Numerical Weather Prediction”, IEEE Transaction on Power Systems, vol. 29, no. 2, 2014.
[11] Martin T. H., Howard B. D., Mark B., Neural Network Design, Shanghai: Thomson Asia PTE LTD and China Machine Press, 2002.
[12] N. Chen, Z. Qian, I. T. Nabney, and X. Meng, “Wind Power Forecasts Using Gaussian Processes and Numerical Weather Prediction”, IEEE Transaction on Power Systems, vol. 29, no. 2, 2014.
[13] R. R. B. de Aquino, H. T. V. Gouveia, M. M. S. Lira, A. A. Ferreira, O. N. Neto, M. A. Carvalho Jr., “Wind Forecasting and Wind Power Generation: Looking for the Best Model Based on Artificial Intelligence”, IEEE World Congress on Computational Intelligence, 2012.
[14] Roger A. Pielke Sr., “Mesoscale Meteorological Modelling”, International Geophysics Series, vol. 98, Oct. 2013.
[15] David J. Stensrud, “Parameterization schemes – Keys To Understanding Numerical Weather Prediction Models”, Cambridge University Press, 2007.
[16] A Parashar, A Parashar, S Goyal, B Sahjalan, “Push recovery for humanoid robot in dynamic environment and classifying the data using K-mean”, Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies.
[17] A Parashar, A Parashar, S Goyal, “Classifying gait data using different machine learning techniques and finding the optimum technique of classification”, Information and Communication Technology for Sustainable Development, 305-313.
[18] A Parashar, D Goyal, “Clustering Gait Data Using Different Machine Learning Techniques and Finding the Best Technique”, International Conference on Smart Trends for Information Technology and Computer Communications.
[19] A Parashar, A Parashar, S Goyal, “Tracing Gesture and Extracting Gait Feature to Recognize Parkinson’s Disease Using Multilayered Back Propagation Algorithm”, International Conference on Innovative Technologies, IN-TECH 2017, 9.
[20] A Parashar, A Parashar, S Goyal, “Identification of gait data using machine learning technique to categories human locomotion”, Proceedings of the 10th International Conference on Security of Information and Networks.