Rainfall Prediction
Rainfall prediction using machine learning

1. INTRODUCTION

Our project is about rainfall prediction. It predicts the rainfall in each state for all the months from January to December. As global warming increases the Earth's temperature, the yearly rainfall patterns of our local regions have been affected. This harms farmers and other people who depend on rainfall, since a proper water supply keeps farmland in good condition. Many studies on rainfall prediction have been conducted using data mining and machine learning. Accurate rainfall estimates are needed to prevent flooding, drought, landslides, mass movements and avalanches, and timely and accurate forecasting can help reduce human and financial loss. The main theme of this project is to study and identify the atmospheric conditions that cause rainfall and to determine its intensity. It describes the relationships between the atmospheric variables that affect rainfall. Rainfall is a climate factor that affects many human activities such as agricultural production, construction, power generation, forestry and tourism. One study identified solar radiation and precipitable water vapour as important variables for daily rainfall prediction using a data-driven machine learning algorithm, but it is often simpler to use linear regression, which has only one independent feature. The use of logistic regression modelling has exploded during the past decade for prediction and forecasting. From its original acceptance in epidemiologic research, the method is now commonly employed in almost all branches of knowledge. Rainfall is one of the most important phenomena of the climate system. It is well known that the variability and intensity of rainfall act on natural, agricultural, human and even total biological systems, so it is essential to be able to predict rainfall by finding appropriate predictors. In this paper an attempt has been made to use logistic regression for predicting rainfall. Climatic data are often subject to gross recording errors, though this problem often goes unnoticed by analysts. I have used very recent screening methods to check and correct the climatic data used in this study. I used fourteen years of daily rainfall data to formulate the model, and then two years of observed daily rainfall data, treated as future data, for cross-validation of the model. The findings clearly show that, given appropriate predictors for rainfall, logistic regression models can predict rainfall very efficiently.

2. SYSTEM ANALYSIS

2.1 Existing System

The existing system used a back-propagation neural network for rainfall prediction. This model was used by Xianggen Gan and was tested using a dataset from 1970 to 2000 containing 16 meteorological parameters. During network training, the target error was set to 0.01 and the learning rate to 0.01. The model was implemented with MATLAB's neural network toolbox. Genetic Programming (GP) and MCRP were compared on 21 different datasets of cities across Europe. Ten years of daily rainfall data were taken as training data and one year of rainfall data as testing data.

Disadvantages Of Existing System

➢ The disadvantage of MCRP is that it predicts accurately only for annual rainfall, not for monthly rainfall prediction.
➢ Multiple linear regression makes several assumptions: a linear relationship between the descriptive and independent variables, independent variables that are not highly correlated with one another, and randomly distributed errors in yi.
➢ Weather is extremely difficult to forecast correctly.
➢ It is expensive to monitor so many variables from so many sources.
➢ The computers needed to perform the millions of necessary calculations are expensive.
○ Algorithms: Markov chain extended with rainfall prediction (MCRP), Genetic Programming

2.2 Proposed System:

The proposed method is based on linear regression. The data for the prediction is collected from publicly available sources; 70 percent of the data is used for training and 30 percent for testing. Linear regression is a statistical method that predicts values with the help of descriptive variables, assuming a linear relationship between the descriptive variable and the output values. The number of observations is denoted by n, the dependent variable by yi and the descriptive variable by xi. β0 and β1 are the constant y-intercept and the slope of the descriptive variable, respectively.
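As an illustrative sketch of the proposed workflow (a 70/30 split followed by a least-squares fit of the intercept and slope), the following uses invented humidity/rainfall pairs rather than real data:

```python
# The 70/30 split and least-squares fit described above, in plain Python.
# The (humidity, rainfall) pairs are invented purely for illustration.
data = [(55, 110), (60, 120), (65, 130), (70, 140), (75, 150),
        (80, 160), (85, 170), (90, 180), (95, 190), (100, 200)]

split = int(len(data) * 0.7)            # first 70% of the data for training
train_set, test_set = data[:split], data[split:]

xs = [x for x, _ in train_set]
ys = [y for _, y in train_set]
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)

# Least-squares slope and intercept for a single descriptive variable x.
b1 = (sum((x - x_mean) * (y - y_mean) for x, y in train_set)
      / sum((x - x_mean) ** 2 for x in xs))
b0 = y_mean - b1 * x_mean

predictions = [b0 + b1 * x for x, _ in test_set]
```

The held-out 30 percent is then compared against `predictions` to judge how well the fitted line generalises.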

Advantages Of Proposed System:

➢ The error free prediction provides better planning in agriculture and other industries.
➢ The linear relationship between the both the descriptive and independent variables, the highly
correlated variables are independent variables, yi is calculated randomly and the mean and
variance are 0 and σ.
➢ The ability to determine the relative influence of one or more predictor variables to the
criterion value
➢ Ability to identify outliers or anomalies

2.3 Procedure To Solve The Given Problem

In this project on rainfall prediction, I use four approaches:
● Linear regression
● K-Nearest Neighbour
● Support Vector Machine
● Decision Tree

2.3.1 Linear Regression

Linear regression is a supervised machine learning method that finds a linear equation that best describes the correlation of the explanatory variables with the dependent variable. This is achieved by fitting a line to the data using least squares, minimising the sum of the squared residuals. A residual is the distance between the line and the actual value of the dependent variable. Finding the line of best fit is an iterative process.
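The iterative fitting process described above can be sketched as gradient descent on the sum of squared residuals. The data points below are toy values chosen for illustration:

```python
# The "iterative process" mentioned above, sketched as gradient descent on the
# sum of squared residuals for a line y = intercept + slope * x.
points = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8), (5.0, 11.1)]

intercept, slope = 0.0, 0.0
lr = 0.02
for _ in range(5000):
    # A residual is the gap between the line's prediction and the actual y;
    # these are the gradients of the squared-residual sum w.r.t. each parameter.
    g_int = sum(2 * ((intercept + slope * x) - y) for x, y in points)
    g_slope = sum(2 * ((intercept + slope * x) - y) * x for x, y in points)
    intercept -= lr * g_int / len(points)
    slope -= lr * g_slope / len(points)

sse = sum(((intercept + slope * x) - y) ** 2 for x, y in points)
```

After enough iterations the parameters settle near the closed-form least-squares solution, and the remaining `sse` is the irreducible scatter of the points around the line.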

Advantages Of Linear Regression Algorithm:

● Linear regression performs exceptionally well for linearly separable data
● It is easy to implement and interpret, and efficient to train
● It handles overfitting reasonably well using dimensionality-reduction techniques and regularisation
● It allows extrapolation beyond a specific data set

2.3.2 K-Nearest Neighbours

The k-nearest neighbours algorithm, also known as KNN or k-NN, is a non-parametric, supervised learning classifier, which uses proximity to make classifications or predictions about the grouping of an individual data point.

While it can be used for either regression or classification problems, it is typically used as a classification algorithm, working off the assumption that similar points can be found near one another.

KNN commonly measures proximity with the Euclidean distance: d(x, y) = √(Σᵢ (xᵢ − yᵢ)²).

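A minimal k-NN sketch based on this idea is shown below; the (humidity, temperature) records and their rain labels are hypothetical:

```python
import math

# Hypothetical training records: (humidity %, temperature °C) -> 1 = rain, 0 = no rain.
train = [((90, 22), 1), ((85, 24), 1), ((80, 21), 1),
         ((30, 35), 0), ((35, 33), 0), ((40, 36), 0)]

def euclidean(a, b):
    # d(x, y) = sqrt(sum_i (x_i - y_i)^2)
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_predict(query, k=3):
    # Sort neighbours by proximity and take a majority vote among the k nearest.
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes > k / 2 else 0
```

A query near the humid cluster inherits the "rain" label from its neighbours, which is exactly the "similar points can be found near one another" assumption in action.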
2.3.3 Support Vector Machine

Support Vector Machine or SVM is one of the most popular Supervised Learning algorithms,
which is used for Classification as well as Regression problems. However, primarily, it is used for
Classification problems in Machine Learning.

The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes, so that new data points can easily be placed in the correct category in the future. This best decision boundary is called a hyperplane.
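A toy illustration of this idea, not a full implementation: the following finds a separating hyperplane for two invented point clouds by sub-gradient descent on the hinge loss (a simplified linear SVM).

```python
# Simplified linear SVM sketch: learn w·x + b = 0 separating labels +1 / -1.
# The point coordinates are made up for illustration.
X = [(2.0, 3.0), (3.0, 3.5), (2.5, 4.0),   # class +1
     (7.0, 8.0), (8.0, 7.5), (7.5, 9.0)]   # class -1
y = [1, 1, 1, -1, -1, -1]

w = [0.0, 0.0]
b = 0.0
lr, lam = 0.01, 0.01          # learning rate and regularisation strength
for _ in range(2000):
    for xi, yi in zip(X, y):
        margin = yi * (w[0] * xi[0] + w[1] * xi[1] + b)
        if margin < 1:        # inside the margin: move the hyperplane
            w[0] += lr * (yi * xi[0] - lam * w[0])
            w[1] += lr * (yi * xi[1] - lam * w[1])
            b += lr * yi
        else:                 # correctly classified with margin: only shrink w
            w[0] -= lr * lam * w[0]
            w[1] -= lr * lam * w[1]

def classify(point):
    return 1 if w[0] * point[0] + w[1] * point[1] + b >= 0 else -1
```

The hinge-loss condition `margin < 1` is what pushes the boundary away from both classes, giving the maximum-margin behaviour that distinguishes an SVM from a plain linear classifier.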

2.3.4 Decision Tree

Decision trees are a nonparametric supervised learning method used for classification and regression. The deeper the tree, the more complex the decision rules and the fitter the model. A decision tree uses a tree representation to solve the problem, in which each leaf node corresponds to a class label and attributes are represented on the internal nodes of the tree. The primary challenge in decision tree implementation is to identify which attributes to split on. There are two popular attribute selection measures: entropy and the Gini index. Entropy is the measure of uncertainty of a random variable; it characterises the impurity of an arbitrary collection of examples. The higher the entropy, the greater the information content.
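Both measures are easy to compute directly. The sketch below evaluates entropy and the Gini index on toy label lists:

```python
import math

# Entropy and Gini index, the two attribute-selection measures named above,
# computed over the class proportions of a list of labels.
def entropy(labels):
    # H = -sum(p * log2(p)) over the class proportions p
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def gini(labels):
    # G = 1 - sum(p^2) over the class proportions p
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

pure = ["rain"] * 6                      # one class only: no uncertainty
mixed = ["rain"] * 3 + ["no_rain"] * 3   # 50/50 split: maximum impurity
```

A split that turns a `mixed` node into two `pure` children removes all the impurity, which is exactly what the tree-building algorithm searches for at each internal node.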

2.4 Motivation:

Rainfall prediction is beneficial but challenging. Machine learning techniques can use computational methods to predict rainfall by retrieving and integrating the hidden knowledge in the linear and non-linear patterns of past weather data. Various tools and methods for predicting rain are currently available, but there is still a shortage of accurate results, and existing methods fail whenever massive datasets are used for rainfall prediction. This study provides efficient rainfall prediction methods based on machine learning techniques, random forest and logistic regression, which provide an easy and accurate prediction, and determines which of the two is more effective. This study would assist researchers in analysing the most recent work on rainfall prediction with an emphasis on machine learning techniques and provide a reference for possible guidance and comparisons.

3. SOFTWARE SPECIFICATION

3.1 Requirements Specification:

The requirement specification describes the software and hardware resources that need to be installed on a server to provide optimal functioning for the application. These software and hardware requirements need to be in place before the packages are installed. They are the most common set of requirements defined by any operating system, and they provide compatible support to the operating system when developing an application.

3.1.1 Hardware Requirements:

The hardware requirement specifies each interface of the software elements and the hardware
elements of the system. These hardware requirements include configuration characteristics.
● System : Pentium IV, 2.4 GHz
● Hard disk : 100 GB
● Monitor : 15" VGA colour
● Mouse : Logitech
● RAM : 1 GB

3.1.2 Software Requirements:

The software requirements specify the use of all required software products, such as data management systems, including their names and versions. Each interface specifies the purpose of the interfacing software as related to this software product.

● Operating system : Windows XP/7/10
● Coding language : Python 3.7
● GUI toolkit : Tkinter
● Python packages : NumPy, pandas, Keras, scikit-learn, tkintertable, Matplotlib, Pillow, imutils
● Models : Random Forest, NLP

Project modules:
● Dataset upload
● Pre-processing the data
● Extracting features from the dataset
● Splitting the dataset into training and testing sets
● Applying the models

3.2 Functional Requirements

Random Forest Algorithm

In machine learning applications for classification and regression, the Random Forest algorithm is a popular supervised method. Just as a forest contains many trees, and the more trees there are the stronger the forest is, a random forest combines many decision trees, and more trees generally make the model stronger.
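A miniature illustration of this idea, not a full implementation: many one-level trees ("stumps"), each trained on a bootstrap sample and voting by majority. The humidity values and the rain threshold below are synthetic assumptions.

```python
import random

# Synthetic data: one feature (humidity) with rain (label 1) when humidity > 60.
random.seed(0)
data = [(h, 1 if h > 60 else 0) for h in range(20, 100, 5)]

def train_stump(sample):
    # Pick the threshold that best separates the bootstrap sample.
    best_t, best_err = None, float("inf")
    for t, _ in sample:
        err = sum(1 for h, label in sample if (1 if h > t else 0) != label)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def train_forest(n_trees=25):
    forest = []
    for _ in range(n_trees):
        sample = [random.choice(data) for _ in data]   # bootstrap sample
        forest.append(train_stump(sample))
    return forest

def forest_predict(forest, humidity):
    # Majority vote over all stumps.
    votes = sum(1 for t in forest if humidity > t)
    return 1 if votes > len(forest) / 2 else 0

forest = train_forest()
```

Each stump sees a slightly different resample of the data, so individual thresholds vary, but the majority vote averages that noise away. That is the "more trees, stronger forest" effect in miniature.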

MLP

A multilayer perceptron (MLP) is a feed-forward artificial neural network that produces a set of outputs from a set of inputs. An MLP is characterised by several layers of input nodes connected in a directed graph between the input and output layers.
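A minimal sketch of such a feed-forward pass, with hand-picked weights (chosen here to implement XOR purely to illustrate the layered, directed flow from inputs to outputs):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    # Each node sums its weighted inputs, adds a bias, applies the activation.
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Hand-picked weights for one hidden layer of two nodes and one output node.
hidden_w = [[20.0, 20.0], [-20.0, -20.0]]
hidden_b = [-10.0, 30.0]
out_w = [[20.0, 20.0]]
out_b = [-30.0]

def mlp(x1, x2):
    hidden = layer([x1, x2], hidden_w, hidden_b)   # input -> hidden layer
    return layer(hidden, out_w, out_b)[0]          # hidden -> output layer
```

In a real MLP these weights would be learned by back-propagation rather than fixed by hand; the point here is only the layered forward computation.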

Genetic Algorithm

The genetic algorithm uses natural selection, the mechanism that drives biological evolution, to solve constrained and unconstrained optimisation problems. The genetic algorithm repeatedly modifies a population of individual solutions. Genes are a collection of elements (variables) that characterise an individual; genes are combined to form a chromosome (solution). An individual's set of genes is represented in a genetic algorithm as an alphabetic string, most commonly binary values (a string of 1s and 0s).
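A minimal sketch of these steps on binary chromosomes, evolving a population toward the all-ones string (the classic OneMax toy problem; population size, generation count and mutation rate are arbitrary choices):

```python
import random

random.seed(1)
LENGTH, POP, GENS = 12, 30, 60

def fitness(chrom):
    return sum(chrom)                     # count of 1 genes

def crossover(a, b):
    cut = random.randint(1, LENGTH - 1)   # single-point crossover
    return a[:cut] + b[cut:]

def mutate(chrom, rate=0.05):
    return [1 - g if random.random() < rate else g for g in chrom]

# Random initial population of binary chromosomes.
population = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]
for _ in range(GENS):
    # Selection: keep the fitter half of the population as parents.
    population.sort(key=fitness, reverse=True)
    parents = population[:POP // 2]
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP)]
    children[0] = parents[0]              # elitism: keep the best unchanged
    population = children

best = max(population, key=fitness)
```

Selection, crossover, mutation and elitism are the same operators a GA would apply to, say, a chromosome encoding model hyperparameters; only the fitness function changes.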

3.3 Non Functional Requirements

All other requirements that do not form a part of the above specification are categorised as non-functional requirements. For example, a system may be required to show the user the number of records in the database. If that count must be updated in real time, the system architects should ensure that the system can refresh the displayed record count within a tolerably short interval as the number of records changes. Sufficient network bandwidth may also be a non-functional requirement of a system.

The following are the features:

● Accessibility
● Availability
● Backup
● Certification
● Compliance
● Configuration Management
● Documentation
● Disaster Recovery
● Efficiency (resource consumption for a given load)
● Interoperability

3.4 Performance Requirements

Performance is measured in terms of the output provided by the application. Requirement specification plays an important part in the analysis of a system. Only when the requirement specifications are properly given is it possible to design a system that will fit into the required environment. It rests largely with the users of the existing system to give the requirement specifications, because they are the people who will finally use the system. The requirements have to be known during the initial stages so that the system can be designed according to them.

The requirement specification for any system can be broadly stated as given below:

● The system should be able to interface with the existing system
● The system should be accurate
● The system should be better than the existing system
● The existing system is completely dependent on the user to perform all the duties

3.5 Feasibility Study:

Preliminary investigation examines project feasibility: the likelihood that the system will be useful to the organisation. The main objective of the feasibility study is to test the technical, operational and economic feasibility of adding new modules and debugging old running systems. All systems are feasible if they are given unlimited resources and infinite time. There are three aspects in the feasibility study portion of the preliminary investigation:

● Technical Feasibility
● Operational Feasibility
● Economic Feasibility

3.5.1 Technical Feasibility:

The technical issues usually raised during the feasibility stage of the investigation include the following:

● Does the necessary technology exist to do what is suggested?
● Does the proposed equipment have the technical capacity to hold the data required to use the new system?
● Will the proposed system provide adequate responses to inquiries, regardless of the number or location of users?
● Can the system be upgraded if developed?
● Are there technical guarantees of accuracy, reliability, ease of access and data security?

3.5.2 Operational Feasibility

User-Friendly

Customers will use the forms for their various transactions, i.e. for adding new routes and viewing route details. The customer also wants reports to view the various transactions based on given constraints. These forms and reports are designed to be user-friendly for the client.

Reliability

The package will pick up current transactions online. Old transactions will be entered into the system by the user.

Security

The web server and database server should be protected from hacking, virus etc

Portability

The application will be developed using standard open-source software (except Oracle), such as Java, the Tomcat web server and the Internet Explorer browser. This software will work on both Windows and Linux operating systems, so portability problems will not arise.

Availability

This software will always be available.

Maintainability

The system uses a 2-tier architecture. The first tier is the GUI, which is the front-end, and the second tier is the database, which uses MySQL and is the back-end. The front-end can run on different client systems, while the database runs on the server. Users access the forms using their user IDs and passwords.

3.5.3 Economic Feasibility:

The computerised system takes care of the present existing system’s data flow and procedures
completely and should generate all the reports of the manual system besides a host of other
management reports.

It should be built as a web based application with separate web server and database server. This
is required as the activities are spread throughout the organisation and customers want a centralised
database. Further some of the linked transactions take place in different locations.

4. PROJECT DESCRIPTION

4.1 SDLC (Software Development Life Cycle)

4.1.1 Umbrella Model

Fig no. 4.1.1 Umbrella Model

SDLC is nothing but Software Development Life Cycle. It is a standard which is used by the
software industry to develop good software.

4.1.2 Requirements Gathering Stage

The requirements gathering process takes as its input the goals identified in the high-level
requirements section of the project plan. Each goal will be refined into a set of one or more
requirements. These requirements define the major functions of the intended application, define
operational data areas and reference data areas, and define the initial data entities. Major functions
include critical processes to be managed, as well as mission critical inputs, outputs and reports. A
user class hierarchy is developed and associated with these major functions, data areas, and data
entities. Each of these definitions is termed a Requirement. Requirements are identified by unique
requirement identifiers and, at minimum, contain a requirement title and textual description.

Fig no. 4.1.2 Requirement Gathering Model

These requirements are fully described in the primary deliverables for this stage: the
Requirements Document and the Requirements Traceability Matrix (RTM). The requirements
document contains complete descriptions of each requirement, including diagrams and references to
external documents as necessary. Note that detailed listings of database tables and fields are not
included in the requirements document.

The title of each requirement is also placed into the first version of the RTM, along with the
title of each goal from the project plan. The purpose of the RTM is to show that the product
components developed during each stage of the software development lifecycle are formally
connected to the components developed in prior stages.

In the requirements stage, the RTM consists of a list of high-level requirements, or goals, by
title, with a listing of associated requirements for each goal, listed by requirement title. In this
hierarchical listing, the RTM shows that each requirement developed during this stage is formally
linked to a specific product goal. In this format, each requirement can be traced to a specific product
goal, hence the term requirements traceability.

The outputs of the requirements definition stage include the requirements document, the RTM,
and an updated project plan.

The feasibility study is all about identifying problems in a project. The number of staff required to handle a project is represented as team formation; in this case, only modules with individual tasks will be assigned to the employees working on that project. Project specifications represent the various possible inputs submitted to the server and the corresponding outputs, along with the reports maintained by the administrator.

4.1.3 Analysis Stage

The planning stage establishes a bird's eye view of the intended software product, and uses this
to establish the basic project structure, evaluate feasibility and risks associated with the project, and
describe appropriate management and technical approaches.

Fig no. 4.1.3 Analysis stage

The most critical section of the project plan is a listing of high-level product requirements, also
referred to as goals. All of the software product requirements to be developed during the
requirements definition stage flow from one or more of these goals. The minimum information for
each goal consists of a title and textual description, although additional information and references to
external documents may be included. The outputs of the project planning stage are the configuration
management plan, the quality assurance plan, and the project plan and schedule, with a detailed
listing of scheduled activities for the upcoming Requirements stage, and high level estimates of effort
for the subsequent stages.

4.1.4 Designing Stage

The design stage takes as its initial input the requirements identified in the approved
requirements document. For each requirement, a set of one or more design elements will be produced
as a result of interviews, workshops, and/or prototype efforts. Design elements describe the desired
software features in detail, and generally include functional hierarchy diagrams, screen layout
diagrams, tables of business rules, business process diagrams, pseudo code, and a complete
entity-relationship diagram with a full data dictionary.

These design elements are intended to describe the software in sufficient detail that skilled
programmers may develop the software with minimal additional input.

Fig no. 4.1.4 Designing stage

When the design document is finalized and accepted, the RTM is updated to show that each
design element is formally associated with a specific requirement. The outputs of the design stage are
the design document, an updated RTM, and an updated project plan.

4.1.5 Development (Coding) Stage

The development stage takes as its primary input the design elements described in the approved
design document. For each design element, a set of one or more software artefacts will be produced.

Fig no. 4.1.5 Coding stage

4.1.6 Integration & Test Stage

During the integration and test stage, the software artefacts, online help, and test data are
migrated from the development environment to a separate test environment. At this point, all test
cases are run to verify the correctness and completeness of the software. Successful execution of the
test suite confirms a robust and complete migration capability. During this stage, reference data is
finalized for production use and production users are identified and linked to their appropriate roles.
The final reference data (or links to reference data source files) and production user list are compiled
into the Production Initiation Plan.

4.1.7 Installation & Acceptance Test

During the installation and acceptance stage, the software artefacts, online help, and initial
production data are loaded onto the production server. At this point, all test cases are run to verify the
correctness and completeness of the software. Successful execution of the test suite is a prerequisite
to acceptance of the software by the customer.

After customer personnel have verified that the initial production data load is correct and the
test suite has been executed with satisfactory results, the customer formally accepts the delivery of
the software.

Fig no. 4.1.7 Installation

4.1.8 Maintenance

The outer rectangle represents the maintenance of a project. The maintenance team will start with a requirement study and an understanding of the documentation; later, employees will be assigned work and will undergo training in their particular assigned category.

5. SYSTEM DESIGN

5.1 Architecture

5.2 UML Diagrams

UML stands for Unified Modeling Language. UML is a standardised general-purpose modelling language in the field of object-oriented software engineering. The standard is managed, and was created by, the Object Management Group.

The goal is for UML to become a common language for creating models of object-oriented computer software. In its current form, UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to, or associated with, UML.
The Unified Modeling Language is a standard language for specifying, visualising, constructing and documenting the artefacts of software systems, as well as for business modelling and other non-software systems.

The UML represents a collection of best engineering practices that have proven successful in the modelling of large and complex systems.

The UML is a very important part of developing object-oriented software and the software development process. The UML uses mostly graphical notations to express the design of software projects.

5.2.1 Goals:

The Primary goals in the design of the UML are as follows:

1. Provide users with a ready-to-use, expressive visual modelling language so that they can develop and exchange meaningful models.

2. Provide extendibility and specialisation mechanisms to extend the core concepts.

3. Be independent of particular programming languages and development processes.

4. Provide a formal basis for understanding the modelling language.

5. Encourage the growth of the OO tools market.

6. Support higher level development concepts such as collaborations, frameworks, patterns and
components.

7. Integrate best practices.

5.3 Use Case Diagram

A use case diagram in the Unified Modeling Language (UML) is a type of behavioural
diagram defined by and created from a Use-case analysis. Its purpose is to present a graphical
overview of the functionality provided by a system in terms of actors, their goals (represented as
use cases), and any dependencies between those use cases. The main purpose of a use case diagram
is to show what system functions are performed for which actor. Roles of the actors in the system
can be depicted.

5.4 Class Diagram:
In software engineering, a class diagram in the Unified Modeling Language (UML) is
a type of static structure diagram that describes the structure of a system by showing the system's
classes, their attributes, operations (or methods), and the relationships among the classes. It explains
which class contains information.

5.5 Sequence Diagram:

A sequence diagram in Unified Modeling Language (UML) is a kind of interaction diagram that shows how processes operate with one another and in what order. It is a construct of a Message Sequence Chart. Sequence diagrams are sometimes called event diagrams, event scenarios, and timing diagrams.

5.6 Activity Diagram

Activity diagrams are graphical representations of workflows of stepwise activities and actions
with support for choice, iteration and concurrency. In the Unified Modeling Language, activity
diagrams can be used to describe the business and operational step-by-step workflows of components
in a system. An activity diagram shows the overall flow of control.

6. SYSTEM TESTING

6.1 System Testing

Testing has become an integral part of any system or project, especially in the field of information technology. The importance of testing, as a method of verifying whether one is ready to move further and can withstand the rigours of a particular situation, cannot be underplayed, which is why testing before deployment is so critical. Before the developed software is given to the user, it must be tested to confirm that it solves the purpose for which it was developed. This testing involves various types through which one can ensure the software is reliable. The program was tested logically, and patterns of execution of the program for a set of data were repeated. Thus the code was exhaustively checked for all possible correct data and the outcomes were also checked.

6.2 Module Testing

To locate errors, each module is tested individually. This enables us to detect errors and correct
them without affecting any other modules. Whenever the program is not satisfying the required
function, it must be corrected to get the required result. Thus all the modules are individually tested
from bottom up starting with the smallest and lowest modules and proceeding to the next level. Each
module in the system is tested separately. For example the job classification module is tested
separately. This module is tested with different job and its approximate execution time and the result
of the test is compared with the results that are prepared manually. Each module in the system is
tested separately. In this system the resource classification and job scheduling modules are tested
separately and their corresponding results are obtained which reduces the process waiting time.

6.3 Integration Testing

After module testing, integration testing is applied. When linking the modules there may be a chance for errors to occur; these errors are corrected by this testing. In this system all modules are connected and tested, and the testing results are correct. Thus the mapping of jobs with resources is done correctly by the system.

6.4 Acceptance Testing

When the user finds no major problems with its accuracy, the system passes through a final acceptance test. This test confirms that the system meets the original goals, objectives and requirements established during analysis. Acceptance testing rests on the shoulders of the users and management; once it passes, the system is finally acceptable and ready for operation.

6.5 Test Cases

Test Case 01: Upload the tasks dataset
Description: Verify whether the file is loaded or not
Step: Dataset is not uploaded
Expected: It cannot display the "file loaded" message
Actual: File is loaded, which displays the task waiting time
Status: High | Priority: High

Test Case 02: Upload patients dataset
Description: Verify whether the dataset is loaded or not
Step: Dataset is not uploaded
Expected: It cannot display "dataset reading process complete"
Actual: It can display "dataset reading process complete"
Status: Low | Priority: High

Test Case 03: Pre-processing
Description: Whether pre-processing is applied on the dataset or not
Step: Pre-processing is not applied
Expected: It cannot display the necessary data for further processing
Actual: It can display the necessary data for further processing
Status: Medium | Priority: High

Test Case 04: Prediction (Random Forest)
Description: Whether the prediction algorithm is applied on the data or not
Step: Algorithm is not applied
Expected: Random tree is not generated
Actual: Random tree is generated
Status: High | Priority: High

Test Case 05: Recommendation
Description: Whether the predicted data is displayed or not
Step: Prediction is not displayed
Expected: It cannot view the prediction containing patient data
Actual: It can view the prediction containing patient data
Status: High | Priority: High

Test Case 06: Noisy Records Chart
Description: Whether the graph is displayed or not
Step: Graph is not displayed
Expected: It does not show the variations between clean and noisy records
Actual: It shows the variations between clean and noisy records
Status: Low | Priority: Medium

Table 6.5.1 Test Cases

7. CONCLUSION & FUTURE ENHANCEMENT

7.1 Conclusion

● There are some specific problems in the world that push the capabilities of data science and its available technology to their edge; rainfall prediction is one of them.
● I can conclude that the best way to use this approach for rainfall prediction is to form a range of the highest and lowest predicted values by adding bias to the model.
● The main objective of rainfall prediction is to predict the amount of rain in a specific region or division using various techniques and to find out which one is best.
● The future scope of rainfall prediction is very promising, with advancements in technology and data-analysis techniques. Potential developments in this field include:
● Improvements in data collection
● Integration of big data
● Advances in cloud computing
● Development of early warning systems
● In summary, the future of rainfall prediction looks bright, and with continued research and innovation I can expect more accurate and reliable predictions that can help people and communities prepare for extreme weather events.

7.2 Future Enhancement

It is not possible to develop a system that meets all the requirements of the user, because user requirements keep changing as the system is being used. Some of the future enhancements that can be made to this system are:
● As technology emerges, it is possible to upgrade the system and adapt it to the desired environment.
● Based on future security issues, security can be improved using emerging technologies such as single sign-on.

8. BIBLIOGRAPHY

1. Xiong, Lihua, and Kieran M. O'Connor. "An empirical method to improve the prediction limits
of the GLUE methodology in rainfall runoff modelling." *Journal of Hydrology* 349.1-2
(2008): 115-124.
2. Schmitz, G. H., and J. Cullmann. "PAI-OFF: A new proposal for online flood forecasting in
flash flood prone catchments." *Journal of Hydrology* 360.1-4 (2008): 1-14.
3. Riordan, Denis, and Bjarne K. Hansen. "A fuzzy case based system for weather prediction."
*Engineering Intelligent Systems for Electrical Engineering and Communications* 10.3
(2002): 139-146.
4. Guhathakurta, P. "Long-range monsoon rainfall prediction of 2005 for the districts and
sub-division Kerala with artificial neural network." *Current Science* 90.6 (2006): 773-779.
5. Pilgrim, D. H., T. G. Chapman, and D. G. Doran. "Problems of rainfall-runoff modelling in arid
and semiarid regions." *Hydrological Sciences Journal* 33.4 (1988): 379-400.
6. Lee, Sunyoung, Sungzoon Cho, and Patrick M. Wong. "Rainfall prediction using artificial
neural networks." *Journal of Geographic Information and Decision Analysis* 2.2 (1998):
233-242.
7. French, Mark N., Witold F. Krajewski, and Robert R. Cuykendall. "Rainfall forecasting in
space and time using a neural network." *Journal of Hydrology* 137.1-4 (1992): 1-31.
8. Charaniya, Nizar Ali, and Sanjay V. Dudul. "Committee of artificial neural networks for
monthly rainfall prediction using wavelet transform." *2011 International Conference on
Business, Engineering and Industrial Applications (IBERIA)*. IEEE, 2011.
9. Noone, David, and Harvey Stern. "Verification of rainfall forecasts from the Australian Bureau
of Meteorology’s Global Assimilation and Prognosis (GASP) system." *Australian
Meteorological Magazine* 44.4 (1995): 275-286.
10. Hornik, Kurt, Maxwell Stinchcombe, and Halbert White. "Multilayer feedforward networks are
universal approximators." *Neural Networks* 2.5 (1989): 359-366.

APPENDIX

A) Sample Forms

● This dataset contains 641 rows and 19 columns; each row has an Indian state name, a district, and the rainfall amount (in cm/inches) for each month from January to December.
● The dataset is considered supervised because it is labelled, which makes it suitable for training algorithms to classify data or predict outcomes accurately.
● Its columns (besides the state and district names) are:
● JAN
● FEB
● MAR
● APR
● MAY
● JUN
● JUL
● AUG
● SEP
● OCT
● NOV
● DEC
● ANNUAL
● JAN-FEB
● MAR-MAY
● JUN-SEP
● OCT-DEC
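Because the dataset is labelled, any column can serve as a supervised target. A minimal sketch, using a few illustrative rows shaped like the district rainfall data and scikit-learn's LinearRegression (the project's actual pipeline may differ):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Illustrative rows shaped like the district rainfall dataset;
# in the project the data is read from 'district.csv' instead.
df = pd.DataFrame({
    "JAN": [107.3, 43.7, 32.7],
    "FEB": [57.9, 26.0, 15.9],
    "ANNUAL": [2805.2, 3015.7, 2913.3],
})

# Supervised learning: monthly rainfall as features, annual total as label.
X, y = df[["JAN", "FEB"]], df["ANNUAL"]
model = LinearRegression().fit(X, y)
pred = model.predict(pd.DataFrame({"JAN": [50.0], "FEB": [20.0]}))
print(pred)
```

The labels here are the known ANNUAL totals, which is what makes the task supervised rather than unsupervised.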

B) Sample Code

The notebook cells, shown as plain code together with their printed output:

import pandas as pd

d = pd.read_csv('district.csv')
print(d)

Output:

     STATE_UT_NAME                DISTRICT         JAN    FEB    MAR    APR
0    ANDAMAN And NICOBAR ISLANDS  NICOBAR          107.3  57.9   65.2   117.0
1    ANDAMAN And NICOBAR ISLANDS  SOUTH ANDAMAN    43.7   26.0   18.6   90.5
2    ANDAMAN And NICOBAR ISLANDS  N & M ANDAMAN    32.7   15.9   8.6    53.4
3    ARUNACHAL PRADESH            LOHIT            42.2   80.8   176.4  358.5
4    ARUNACHAL PRADESH            EAST SIANG       33.3   79.5   105.9  216.5
..   ...                          ...              ...    ...    ...    ...
636  KERALA                       IDUKKI           13.4   22.1   43.6   150.4
637  KERALA                       KASARGOD         2.3    1.0    8.4    46.9
638  KERALA                       PATHANAMTHITTA   19.8   45.2   73.9   184.9
639  KERALA                       WAYANAD          4.8    8.3    17.5   83.3
640  LAKSHADWEEP                  LAKSHADWEEP      20.8   14.7   11.8   48.9

     MAY    JUN    JUL     AUG    SEP    OCT    NOV    DEC    ANNUAL  Jan-Feb
0    358.5  295.5  285.0   271.9  354.8  326.0  315.2  250.9  2805.2  165.2
1    374.4  457.2  421.3   423.1  455.6  301.2  275.8  128.3  3015.7  69.7
2    343.6  503.3  465.4   460.9  454.8  276.1  198.6  100.0  2913.3  48.6
3    306.4  447.0  660.1   427.8  313.6  167.1  34.1   29.8   3043.8  123.0
4    323.0  738.3  990.9   711.2  568.0  206.9  29.5   31.7   4034.7  112.8
..   ...    ...    ...     ...    ...    ...    ...    ...    ...     ...
636  232.6  651.6  788.9   527.3  308.4  343.2  172.9  48.1   3302.5  35.5
637  217.6  999.6  1108.5  636.3  263.1  234.9  84.6   18.4   3621.6  3.3
638  294.7  556.9  539.9   352.7  266.2  359.4  213.5  51.3   2958.4  65.0
639  174.6  698.1  1110.4  592.9  230.7  213.1  93.6   25.8   3253.1  13.1
640  171.7  330.2  287.7   217.5  163.1  157.1  117.7  58.8   1600.0  35.5

     Mar-May  Jun-Sep  Oct-Dec
0    540.7    1207.2   892.1
1    483.5    1757.2   705.3
2    405.6    1884.4   574.7
3    841.3    1848.5   231.0
4    645.4    3008.4   268.1
..   ...      ...      ...
636  426.6    2276.2   564.2
637  272.9    3007.5   337.9
638  553.5    1715.7   624.2
639  275.4    2632.1   332.5
640  232.4    998.5    333.6

[641 rows x 19 columns]

import seaborn as sn
import matplotlib.pyplot as plt

# NOTE: cov_matrix is not defined in the cells reproduced here; it is
# assumed to hold a covariance/correlation matrix of the numeric columns,
# e.g. cov_matrix = d.select_dtypes(include='number').corr()
sn.heatmap(cov_matrix, annot=True)
plt.show()

Output: an annotated heatmap figure (640x480, two axes).

