Rainfall prediction
1. INTRODUCTION
Our project is about rainfall prediction: it predicts the rainfall in each state for every month from January to December. As global warming increases the earth's temperature, the yearly rainfall patterns of our local regions have been affected. This harms farmers and other people who depend on rainfall, since a proper water supply keeps farmland in good condition. Many studies on rainfall prediction have been conducted using data mining and machine learning, because reliable rainfall forecasts help prevent flooding, drought, landslides, mass movements and avalanches. Timely and accurate forecasting can help reduce human and financial loss.
The main theme of this project is to study and identify the atmospheric conditions that cause rainfall and determine its intensity, and to describe the relationship between the atmospheric variables that affect rainfall. Rainfall is a climate factor that affects many human activities such as agricultural production, construction, power generation, forestry and tourism. One study, using data-driven machine learning algorithms, identified solar radiation and precipitable water vapour as important variables for daily rainfall prediction; however, it is simpler to begin with simple linear regression, which has only one independent feature.
The use of logistic regression modelling has exploded during the past decade for prediction and forecasting. From its original acceptance in epidemiologic research, the method is now commonly employed in almost all branches of knowledge. Rainfall is one of the most important phenomena of the climate system, and it is well known that the variability and intensity of rainfall act on natural, agricultural, human and even entire biological systems. It is therefore essential to be able to predict rainfall by finding appropriate predictors. In this paper an attempt has been made to use logistic regression for predicting rainfall. Climatic data are often subject to gross recording errors, though this problem frequently goes unnoticed by analysts, so we used recent screening methods to check and correct the climatic data used in our study. We used fourteen years of daily rainfall data to formulate our model, and then used two years of observed daily rainfall data, treated as future data, for cross-validation of the model. Our findings clearly show that, given appropriate predictors, logistic regression models can predict rainfall very efficiently.
2. SYSTEM ANALYSIS
2.1 Existing System:
The existing system used a back-propagation neural network for rainfall prediction. This model was built by Xianggen Gan and tested on a dataset from 1970 to 2000 containing 16 meteorological parameters. During network training the target error was set to 0.01 and the learning rate to 0.01. The model was implemented using MATLAB neural networks. Genetic Programming (GP) and MCRP were compared on 21 different datasets of cities across Europe; ten years of daily rainfall data were taken as training data and one year of rainfall data as testing data.
➢ The disadvantage of MCRP is that it predicts accurately only for annual rainfall, not for monthly rainfall prediction.
➢ The assumptions made by multiple linear regression are: a linear relationship between the descriptive (independent) variables and the dependent variable, independent variables that are not highly correlated with one another, and yi values that are sampled randomly.
➢ Weather is extremely difficult to forecast correctly.
➢ It is expensive to monitor so many variables from so many sources.
➢ The computers needed to perform the millions of calculations necessary are expensive.
○ Algorithm: Markov chain extended with rainfall prediction (MCRP), Genetic Programming
2.2 Proposed System:
The proposed method is based on linear regression. The data for the prediction is collected from publicly available sources, and 70 percent of the data is used for training while the remaining 30 percent is used for testing. Linear regression is a statistical method that predicts output values with the help of descriptive variables, assuming a linear relationship between the descriptive variables and the output values. The number of observations is denoted by n, the dependent variable by yi, and the descriptive variables by xi; β0 is the constant y-intercept and β1, ..., βp are the slopes of the descriptive variables.
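Written out explicitly (a standard statement of the model just described, with notation matching the paragraph above), the regression equation is

\[
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i, \qquad i = 1, \dots, n,
\]

where the error terms \( \varepsilon_i \) are assumed to have mean 0 and variance \( \sigma^2 \).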
➢ Error-free prediction provides better planning in agriculture and other industries.
➢ The model assumes a linear relationship between the descriptive variables and the dependent variable, that the independent variables are not highly correlated with one another, that yi is sampled randomly, and that the errors have mean 0 and variance σ².
➢ It provides the ability to determine the relative influence of one or more predictor variables on the criterion value.
➢ It provides the ability to identify outliers or anomalies.
2.3 Procedure To Solve The Given Problem
In this rainfall prediction project, we use four approaches:
● Linear regression
● K-Nearest Neighbour
● Support Vector Machine
● Decision Tree
2.3.1 Linear Regression
Linear regression is a supervised machine learning method that finds a linear equation that best describes the correlation of the explanatory variables with the dependent variable. This is achieved by fitting a line to the data using least squares: the line minimizes the sum of the squares of the residuals, where a residual is the distance between the line and the actual value of the dependent variable. Finding the line of best fit is an iterative process.
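A minimal sketch of this step in Python, assuming the district.csv dataset shown in the appendix (the JUN and ANNUAL column names come from that dataset) and scikit-learn with the 70/30 split described above:

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Load the district-wise rainfall dataset used in the appendix
d = pd.read_csv('district.csv')

# Simple linear regression: one descriptive variable (June rainfall)
# against one dependent variable (annual rainfall)
X = d[['JUN']]
y = d['ANNUAL']

# 70 percent for training, 30 percent for testing, as proposed above
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)
print('Intercept (beta_0):', model.intercept_)
print('Slope (beta_1):', model.coef_[0])
print('Test MSE:', mean_squared_error(y_test, pred))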
2.3.2 K-Nearest Neighbours
While it can be used for either regression or classification problems, KNN is typically used as a classification algorithm, working off the assumption that similar points can be found near one another. Nearness is usually measured with the Euclidean distance between feature vectors:

\[
d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}
\]
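A short sketch of KNN applied to this regression task, assuming scikit-learn; the feature columns and k = 5 are illustrative choices, not taken from the project:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

d = pd.read_csv('district.csv')
X = d[['JAN', 'FEB', 'MAR', 'APR', 'MAY']]   # illustrative feature columns
y = d['ANNUAL']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# k = 5 nearest neighbours; Euclidean distance is the default metric
knn = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
print('Test R^2:', knn.score(X_test, y_test))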
2.3.3 Support Vector Machine
Support Vector Machine (SVM) is one of the most popular supervised learning algorithms, used for classification as well as regression problems. Primarily, however, it is used for classification problems in machine learning.
The goal of the SVM algorithm is to create the best line or decision boundary that segregates n-dimensional space into classes, so that new data points can easily be placed in the correct category in the future. This best decision boundary is called a hyperplane.
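A hedged sketch using scikit-learn's support vector regressor (the regression counterpart of the hyperplane-based classifier described above); the feature columns and hyperparameters are illustrative assumptions:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

d = pd.read_csv('district.csv')
X = d[['JUN', 'JUL', 'AUG', 'SEP']]   # monsoon months, an illustrative choice
y = d['ANNUAL']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# RBF-kernel support vector machine for regression
svr = SVR(kernel='rbf', C=100.0).fit(X_train, y_train)
print('Test R^2:', svr.score(X_test, y_test))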
2.3.4 Decision Tree
Decision trees are a nonparametric supervised learning method used for classification and regression. The deeper the tree, the more complex the decision rules and the fitter the model. A decision tree uses a tree representation to solve the problem: each leaf node corresponds to a class label, and attributes are represented on the internal nodes of the tree. The primary challenge in decision tree implementation is identifying the attributes to split on. There are two popular attribute selection measures: entropy and the Gini index. Entropy is the measure of uncertainty of a random variable; it characterizes the impurity of an arbitrary collection of examples. The higher the entropy, the greater the information content.
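For reference, the standard definitions of the two measures, for a node whose examples fall into \( k \) classes with proportions \( p_1, \dots, p_k \), are

\[
\text{Entropy}(S) = -\sum_{i=1}^{k} p_i \log_2 p_i, \qquad
\text{Gini}(S) = 1 - \sum_{i=1}^{k} p_i^2 .
\]

A brief decision tree sketch on the appendix dataset, assuming scikit-learn (gini and entropy are criteria for classification trees; the regression tree below uses a variance-based criterion instead):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

d = pd.read_csv('district.csv')
X = d[['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN']]   # illustrative features
y = d['ANNUAL']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Limiting the depth keeps the decision rules simple and curbs overfitting
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_train, y_train)
print('Test R^2:', tree.score(X_test, y_test))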
2.4 Motivation:
3. SOFTWARE SPECIFICATION
The requirement specification deals with the software and hardware resources that need to be installed on a server to provide optimal functioning for the application. These software and hardware requirements need to be in place before the packages are installed. They are the most common set of requirements defined by any operating system, and they provide compatible support to the operating system when developing an application.
The hardware requirements specify each interface between the software elements and the hardware elements of the system, including configuration characteristics.
3.1.1 Hardware Requirements:
● System : Pentium IV 2.4 GHz.
● Hard Disk : 100 GB.
● Monitor : 15 VGA Colour.
● Mouse : Logitech.
● RAM : 1 GB.
3.1.2 Software Requirements:
The software requirements specify all required software products, such as data management systems, including their numbers and versions. Each interface specifies the purpose of the interfacing software as related to this software product.
MLP
A multilayer perceptron (MLP) is a feed-forward artificial neural network that produces a set of outputs from a set of inputs. An MLP is characterised by several layers of input nodes connected as a directed graph between the input and output layers.
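A minimal sketch of an MLP on the appendix dataset, assuming scikit-learn's MLPRegressor; the layer size and other hyperparameters are illustrative:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

d = pd.read_csv('district.csv')
X = d[['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN']]
y = d['ANNUAL']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# One hidden layer of 32 units; feature scaling helps the optimiser converge
mlp = make_pipeline(StandardScaler(),
                    MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0))
mlp.fit(X_train, y_train)
print('Test R^2:', mlp.score(X_test, y_test))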
Genetic Algorithm
The genetic algorithm uses natural selection, the mechanism that drives biological evolution, to solve constrained and unconstrained optimisation problems. The genetic algorithm repeatedly modifies a population of individual solutions. Genes are a collection of elements (variables) that characterise an individual; genes are combined to form a chromosome (solution). An individual's set of genes is represented in a genetic algorithm as an alphabetic string, commonly using binary values (a string of 1s and 0s).
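To make the terms above concrete, here is a minimal, self-contained genetic-algorithm sketch in Python (a toy "one-max" problem, not the project's code): chromosomes are binary strings, fitness counts the 1s, and selection, crossover and mutation repeatedly modify the population:

import random

def fitness(chromosome):
    # Toy fitness: the number of 1s in the binary chromosome
    return sum(chromosome)

def genetic_algorithm(pop_size=20, length=16, generations=50, mutation_rate=0.01):
    # Initial population of random binary chromosomes
    population = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half as parents
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        # Crossover: splice two parents at a random cut point
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, length)
            child = a[:cut] + b[cut:]
            # Mutation: flip each gene with a small probability
            child = [1 - g if random.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

print(genetic_algorithm())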
3.3 Non Functional Requirements
All the other requirements which do not form a part of the above specification are categorised as non-functional requirements. For example, a system may be required to show the user the number of records in a database during data entry. If that count must be updated in real time, the system architects should make sure that the system is capable of refreshing the displayed record count within an acceptably short interval as the number of records changes. Sufficient network bandwidth may also be a non-functional requirement of a system. Typical non-functional requirements include:
● Accessibility
● Availability
● Backup
● Certification
● Compliance
● Configuration Management
● Documentation
● Disaster Recovery
● Efficiency (resource consumption for a given load)
● Interoperability
3.4 Performance Requirements
The requirement specification for any system can be broadly stated as given below:
3.5 Feasibility Study:
Preliminary investigation examines project feasibility: the likelihood that the system will be useful to the organisation. The main objective of the feasibility study is to test the technical, operational and economical feasibility of adding new modules and debugging old running systems. All systems are feasible if they are given unlimited resources and infinite time. There are three aspects to the feasibility study portion of the preliminary investigation:
● Technical Feasibility
● Operational Feasibility
● Economical Feasibility
The technical issues usually raised during the feasibility stage of the investigation include the following:
User-Friendly
Customers will use the forms for their various transactions, i.e. for adding new routes and viewing route details. The customer also wants reports to view the various transactions based on the constraints. These forms and reports are generated in a user-friendly manner for the client.
Reliability
The package will pick up current transactions online. Old transactions will be entered into the system by the user.
Security
The web server and database server should be protected from hacking, viruses, etc.
Portability
The application will be developed using standard open-source software (except Oracle) such as Java, the Tomcat web server and the Internet Explorer browser. This software will work on both Windows and Linux operating systems, so portability problems will not arise.
Availability
Maintainability
The system uses a 2-tier architecture. The first tier is the GUI, which is the front-end, and the second tier is the database, which uses MySQL and is the back-end. The front-end can be run on different client systems, while the database runs on the server. Users access the forms using their user-ids and passwords.
The computerised system takes care of the present system's data flow and procedures completely, and should generate all the reports of the manual system besides a host of other management reports.
It should be built as a web-based application with separate web and database servers. This is required because the activities are spread throughout the organisation and customers want a centralised database; further, some of the linked transactions take place in different locations.
4. PROJECT DESCRIPTION
SDLC stands for Software Development Life Cycle. It is a standard used by the software industry to develop good software.
4.1.2 Requirements Gathering Stage
The requirements gathering process takes as its input the goals identified in the high-level
requirements section of the project plan. Each goal will be refined into a set of one or more
requirements. These requirements define the major functions of the intended application, define
operational data areas and reference data areas, and define the initial data entities. Major functions
include critical processes to be managed, as well as mission critical inputs, outputs and reports. A
user class hierarchy is developed and associated with these major functions, data areas, and data
entities. Each of these definitions is termed a Requirement. Requirements are identified by unique
requirement identifiers and, at minimum, contain a requirement title and textual description.
These requirements are fully described in the primary deliverables for this stage: the
Requirements Document and the Requirements Traceability Matrix (RTM). The requirements
document contains complete descriptions of each requirement, including diagrams and references to
external documents as necessary. Note that detailed listings of database tables and fields are not
included in the requirements document.
The title of each requirement is also placed into the first version of the RTM, along with the
title of each goal from the project plan. The purpose of the RTM is to show that the product
components developed during each stage of the software development lifecycle are formally
connected to the components developed in prior stages.
In the requirements stage, the RTM consists of a list of high-level requirements, or goals, by
title, with a listing of associated requirements for each goal, listed by requirement title. In this
hierarchical listing, the RTM shows that each requirement developed during this stage is formally
linked to a specific product goal. In this format, each requirement can be traced to a specific product
goal, hence the term requirements traceability.
The outputs of the requirements definition stage include the requirements document, the RTM,
and an updated project plan.
The feasibility study is all about the identification of problems in a project. The number of staff required to handle a project is represented as team formation; in this case only modules with individual tasks will be assigned to the employees who are working on that project.
Project specifications represent the various possible inputs submitted to the server and the corresponding outputs, along with the reports maintained by the administrator.
4.1.3 Analysis Stage
The planning stage establishes a bird's eye view of the intended software product, and uses this
to establish the basic project structure, evaluate feasibility and risks associated with the project, and
describe appropriate management and technical approaches.
The most critical section of the project plan is a listing of high-level product requirements, also
referred to as goals. All of the software product requirements to be developed during the
requirements definition stage flow from one or more of these goals. The minimum information for
each goal consists of a title and textual description, although additional information and references to
external documents may be included. The outputs of the project planning stage are the configuration management plan, the quality assurance plan, and the project plan and schedule, with a detailed listing of scheduled activities for the upcoming requirements stage and high-level estimates of effort for the later stages.
4.1.4 Designing Stage
The design stage takes as its initial input the requirements identified in the approved
requirements document. For each requirement, a set of one or more design elements will be produced
as a result of interviews, workshops, and/or prototype efforts. Design elements describe the desired
software features in detail, and generally include functional hierarchy diagrams, screen layout
diagrams, tables of business rules, business process diagrams, pseudo code, and a complete
entity-relationship diagram with a full data dictionary.
These design elements are intended to describe the software in sufficient detail that skilled
programmers may develop the software with minimal additional input.
When the design document is finalized and accepted, the RTM is updated to show that each
design element is formally associated with a specific requirement. The outputs of the design stage are
the design document, an updated RTM, and an updated project plan.
4.1.5 Development (Coding) Stage
The development stage takes as its primary input the design elements described in the approved
design document. For each design element, a set of one or more software artefacts will be produced.
4.1.6 Integration &amp; Test Stage
During the integration and test stage, the software artefacts, online help, and test data are
migrated from the development environment to a separate test environment. At this point, all test
cases are run to verify the correctness and completeness of the software. Successful execution of the
test suite confirms a robust and complete migration capability. During this stage, reference data is
finalized for production use and production users are identified and linked to their appropriate roles.
The final reference data (or links to reference data source files) and production user list are compiled
into the Production Initiation Plan.
4.1.7 Installation & Acceptance Test
During the installation and acceptance stage, the software artefacts, online help, and initial
production data are loaded onto the production server. At this point, all test cases are run to verify the
correctness and completeness of the software. Successful execution of the test suite is a prerequisite
to acceptance of the software by the customer.
After customer personnel have verified that the initial production data load is correct and the
test suite has been executed with satisfactory results, the customer formally accepts the delivery of
the software.
4.1.8 Maintenance
The outer rectangle represents maintenance of a project. The maintenance team will start with a requirement study and an understanding of the documentation; later, employees will be assigned work and will undergo training in their particular assigned category.
5. SYSTEM DESIGN
5.1 Architecture
The goal is for UML to become a common language for creating models of object-oriented computer software. In its current form UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to, or associated with, UML.
The Unified Modeling Language is a standard language for specifying, visualizing, constructing and documenting the artefacts of software systems, as well as for business modelling and other non-software systems.
The UML represents a collection of best engineering practices that have proven successful in the modelling of large and complex systems.
The UML is a very important part of developing object-oriented software and of the software development process. The UML uses mostly graphical notations to express the design of software projects.
5.2.1 Goals:
1. Provide users with a ready-to-use, expressive visual modelling language so that they can develop and exchange meaningful models.
6. Support higher-level development concepts such as collaborations, frameworks, patterns and components.
5.3 Use Case Diagram
A use case diagram in the Unified Modeling Language (UML) is a type of behavioural diagram defined by and created from a use-case analysis. Its purpose is to present a graphical overview of the functionality provided by a system in terms of actors, their goals (represented as use cases), and any dependencies between those use cases. The main purpose of a use case diagram is to show which system functions are performed for which actor. The roles of the actors in the system can also be depicted.
5.4 Class Diagram:
In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of static structure diagram that describes the structure of a system by showing the system's classes, their attributes, operations (or methods), and the relationships among the classes. It shows which class contains which information.
5.5 Sequence Diagram:
A sequence diagram in UML is a kind of interaction diagram that shows how processes operate with one another and in what order.
5.6 Activity Diagram
Activity diagrams are graphical representations of workflows of stepwise activities and actions
with support for choice, iteration and concurrency. In the Unified Modeling Language, activity
diagrams can be used to describe the business and operational step-by-step workflows of components
in a system. An activity diagram shows the overall flow of control.
6. SYSTEM TESTING
Testing has become an integral part of any system or project, especially in the field of information technology. The importance of testing, as a method of judging whether one is ready to move further and whether the system can withstand the rigours of a particular situation, cannot be underplayed, and that is why testing before deployment is so critical. When software is developed, before it is given to the user it must be tested to check whether it solves the purpose for which it was developed. This testing involves various types through which one can ensure that the software is reliable. The program was tested logically, and patterns of execution of the program for a set of data were repeated. Thus the code was exhaustively checked for all possible correct data and the outcomes were also checked.
To locate errors, each module is tested individually. This enables us to detect errors and correct them without affecting any other modules. Whenever the program does not satisfy the required function, it must be corrected to get the required result. Thus all the modules are tested individually, bottom-up, starting with the smallest and lowest-level modules and proceeding to the next level. For example, the job classification module is tested separately with different jobs and their approximate execution times, and the result of the test is compared with results prepared manually. In this system the resource classification and job scheduling modules are tested separately and their corresponding results are obtained, which reduces the process waiting time.
After module testing, integration testing is applied. When linking the modules there may be a chance for errors to occur; these errors are corrected by this testing. In this system all modules are connected and tested. The testing results are correct; thus the mapping of jobs to resources is done correctly by the system.
6.4 Acceptance Testing
When the user finds no major problems with its accuracy, the system passes through a final acceptance test. This test confirms that the system meets the original goals, objectives and requirements established during analysis. By putting the acceptance tests on the shoulders of users and management, waste of time and money is eliminated; once accepted, the system is finally ready for operation.
Test case 04: Prediction with Random Forest
Condition: whether the Random Forest algorithm is applied on the data or not
Expected result: if the algorithm is not applied, the random tree is not generated
Actual result: the random tree is generated
Priority: High    Severity: High
7. CONCLUSION & FUTURE ENHANCEMENT
7.1 Conclusion
● There are some specific problems in the world that push the capability of data science, and of the technology available in this field, to their edge; rainfall prediction is one of them.
● We can conclude that for rainfall prediction the best approach is to form a range of highest and lowest predicted values by adding a bias term to the model.
● The main objective of rainfall prediction is to predict the amount of rain in a specific area or division using various techniques and to find out which one is best.
Future scope of rainfall prediction:
● The future scope of rainfall prediction is very promising, with advancements in technology and data analysis techniques. Some of the potential developments in this field include:
● Improvements in data collection
● Integration of big data
● Advances in cloud computing
● Development of early warning systems
● In summary, the future of rainfall prediction looks bright, and with continued research and innovation we can expect more accurate and reliable predictions that can help people and communities prepare for extreme weather events.
7.2 Future Enhancement
It is not possible to develop a system that meets all the requirements of the user; user requirements keep changing as the system is being used. Some of the future enhancements that can be made to this system are:
● As technology emerges, it is possible to upgrade the system and adapt it to the desired environment.
● Based on future security issues, security can be improved using emerging technologies such as single sign-on.
APPENDIX
A) Sample Forms
● This dataset contains 641 rows and 19 columns. Each row has an Indian state name, a district, and the rainfall amount (in cm/inches) for each and every month from January to December.
● The dataset is considered supervised because it is a labelled dataset, used to train algorithms to classify data or predict outcomes accurately.
● The columns include:
● JAN
● FEB
● MAR
● APR
● MAY
● JUNE
● JUL
● AUG
● SEP
● OCT
● NOV
● DEC
● JAN-FEB
● MAR-MAY
● JUN-SEP
● OCT-DEC
● ANNUAL
B) Sample Code
The code below is recovered from the project's Jupyter notebook. The cell defining cov_matrix was not preserved in the notebook extract, so a covariance matrix over the numeric rainfall columns is assumed here.

import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt

# Load the district-wise rainfall dataset: 641 rows x 19 columns
# (STATE_UT_NAME, DISTRICT, JAN ... DEC, ANNUAL, Jan-Feb, Mar-May,
# Jun-Sep and Oct-Dec)
d = pd.read_csv('district.csv')
print(d)

# Covariance matrix of the numeric rainfall columns
# (assumed definition: the original defining cell was not preserved)
cov_matrix = d.select_dtypes(include='number').cov()

# Heatmap of the covariance matrix
sn.heatmap(cov_matrix, annot=True)
plt.show()