
Project Documentation - India

Uploaded by

Abel Soby Joseph

CLASS XII

RECORD OF PROJECT WORK IN


INFORMATICS PRACTICES

Abel Soby Joseph


Name : -------------------------------------------------------

Roll No : -----------------------------------------------------
Project Report submitted in fulfillment of Class XII

Syllabus Requirement By

--------------------------------------------------------------------

This is to certify that this Project titled ------------------------------------------------------

----------------------------------------------------------------------------------------------------

is the record of bonafide project work carried out by ------------------------------------

----------------------------------------------------Roll . No. -------------------------of class XII,

Section , during the academic year 2023 – 2024

Teacher in Charge Principal Examiner


Acknowledgement
This project is the fruit of the diligent labour that has been invested in it by our group.

Apart from our personal hard work, this project has received great support from the

school, without which its presentation would have been impossible.

We take this opportunity to thank our manager Rev. Fr. John Panchickal, our Principal

Fr. Antony T L , Vice Principal Fr. Shiju Varghese Kandaplackal and our teachers who

have helped us in our endeavor. Special mention must be made of our computer

teachers, Mr. Cherian K Abraham and Mrs. Thanuja Mathew, in this regard. We would

also like to thank our friends for their co-operation and suggestions. Finally, we thank

the Almighty for all his blessings.


ABSTRACT

When confronted with a large amount of data, we seek to summarize the data into
statistics that capture the essence of the data with as few numbers as possible. This project
provides a statistical analysis of COVID-19 patient data collected from the states of India
over a period of one month. It contains data on people infected with the COVID-19 virus.
The data set has 36 observations that include the state-wise details of Total cases, Active
cases, Discharged cases, Number of Deaths, Active case ratio, Discharge case ratio and
Death ratio, collected over a month.

Graphing the data has a similar goal: to reduce the data to an image that represents
all the key aspects of the raw data. Various types of plots were used in the project to
summarize the statistics.
Contents

1. INTRODUCTION

2. SYSTEM ANALYSIS

2.1 PROJECT AIM AND OBJECTIVES

2.2 PROBLEM STATEMENT

2.3 SYSTEM SPECIFICATIONS

3. SYSTEM DESIGN

3.1 CSV FILE DESIGN

3.2 EXPLORING THE DATA WITH PANDAS

3.3 PLOTTING SUMMARY STATISTICS

3.4 MENU STRUCTURE

4. SYSTEM IMPLEMENTATION

4.1 SOURCE CODE

4.2 SCREEN SHOTS

5. CONCLUSION

6. REFERENCES
1. INTRODUCTION

What is Pandas?
Pandas is a package commonly used to deal with data analysis. It simplifies the loading
of data from external sources such as text files and databases, as well as providing ways
of analysing and manipulating data once it is loaded into your computer. The features
provided in pandas automate and simplify a lot of the common tasks that would take
many lines of code to write in the basic Python language.
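
As a minimal sketch of the loading step described above, the snippet below parses a few lines of CSV text into a DataFrame. The three rows come from this project's data file; the reduced set of columns, the in-memory string read through io.StringIO and the helper name load_sample are assumptions made only to keep the example self-contained.

```python
import io

import pandas as pd

# Three rows from the project's data, with a reduced set of columns (assumption).
SAMPLE_CSV = """STATE,TOTAL,ACTIVE,DISCHARGED,DEATHS
GA,173717,952,169572,3193
SK,29729,1333,28027,369
TR,82775,1028,80950,797
"""


def load_sample(text=SAMPLE_CSV):
    """Parse CSV text into a DataFrame; pd.read_csv accepts any file-like object."""
    return pd.read_csv(io.StringIO(text))


df = load_sample()
print(df.shape)            # (number of rows, number of columns)
print(df["TOTAL"].sum())   # one call replaces a hand-written summing loop
```

With a real file, the same call takes the file path in place of the StringIO object.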

Pandas is a hugely popular and still growing Python library that is used across a range
of disciplines like environmental and climate science, social science, linguistics, biology,
as well as a number of applications in industry such as data analytics, financial trading,
and many others.

Pandas is best suited for structured, labelled data, in other words, tabular data that has
headings associated with each column of data. Pandas is fast. Python sometimes gets a
bad rap for being a bit slow compared to 'compiled' languages such as C and Fortran. But
the performance-critical internals of Pandas are written in C and Cython and build on
NumPy, so processing large datasets is usually no problem for Pandas.

The results obtained after analysis are used to make inferences or draw conclusions about
the data, as well as to make important business decisions. Sometimes it is not easy to
infer anything by merely looking at the results. In such cases, visualisation helps in
better understanding the results of the analysis. Data visualisation means graphical or
pictorial representation of the data using graphs, charts, etc. The purpose of plotting
data is to visualize variation or show relationships between variables. Visualisation
also helps to effectively communicate information to intended users.
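
To illustrate the idea of turning numbers into a picture, here is a small sketch that draws a bar chart from a pandas Series. The three death counts are taken from this project's data; the helper name bar_chart and the non-interactive "Agg" backend are assumptions chosen so that the snippet runs without opening a window.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened

import matplotlib.pyplot as plt
import pandas as pd

# Death counts for three states, taken from the project's data.
deaths = pd.Series({"GA": 3193, "SK": 369, "TR": 797}, name="DEATHS")


def bar_chart(series, title):
    """Draw one bar per index label and return the Axes object."""
    ax = series.plot(kind="bar")
    ax.set_title(title)
    ax.set_xlabel("State")
    ax.set_ylabel(series.name)
    return ax


ax = bar_chart(deaths, "Statewise Death Cases")
print(len(ax.patches), "bars drawn")
```

In an interactive session the backend line can be dropped and plt.show() called to display the chart.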
2. SYSTEM ANALYSIS

2.1 PROJECT AIMS AND OBJECTIVES

Coronavirus disease 2019 (COVID-19) is a contagious disease caused by severe
acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The first case was identified in
Wuhan, China, in December 2019. It has since spread worldwide, leading to an ongoing
pandemic. Most people infected with the COVID-19 virus experience mild to
moderate respiratory illness and recover without requiring special treatment. Older
people, and those with underlying medical problems like cardiovascular disease,
diabetes, chronic respiratory disease, and cancer, are more likely to develop serious
illness. A new strain of the coronavirus was discovered in Britain. It is said to mutate
faster than the older variant.

The first case of COVID-19 in India, which originated from China, was reported on
30 January 2020. At the time of writing, India has the largest number of confirmed cases
in Asia and the second-highest number of confirmed cases in the world after the United
States.

2.2 PROBLEM STATEMENT:

This project analyses COVID-19 patient data collected from different states of India over
a period of one month. The data was taken from the website of the Government of India. It
contains data on people infected with the COVID-19 virus. The data set has 36 observations
that include the state-wise details of Total cases, Active cases, Discharged cases, Number
of Deaths, Active case ratio, Discharge case ratio and Death ratio, collected over a month.
2.3 SYSTEM SPECIFICATIONS

HARDWARE SPECIFICATIONS

The following is the hardware specification of the system on which the software
has been developed:-

Operating System : Windows Vista / 7 / 10 or Ubuntu

Windows 7 is used as the operating system as it is stable, supports the features
this project needs and is user friendly.

Machine : Pentium Dual Core Processor 2.6 GHz or above,
2 GB RAM or above, 500 GB Hard Disk or above

We used a system based on a 2nd generation Intel Core i5 processor. It is fast and
provides reliable, stable performance, so the PC can run for long periods and the
project can be developed without interruption. 4 GB of RAM is used as it provides
fast reading and writing and supports the processing.
SOFTWARE SPECIFICATIONS

Front End Used : PYTHON 3.8.0 or above

Backend Used : CSV Files


3. SYSTEM DESIGN
3.1 CSV FILE DESIGN

For this project, we took data from the website of the Government of India. It contains
data on people infected with the COVID-19 virus. The data set has 36 observations that
include the state-wise details of Total cases, Active cases, Discharged cases, Number of
Deaths, Active case ratio, Discharge case ratio and Death ratio, collected over a month.
COVIDIN.CSV

STATE  STATE/UTS                        TOTAL  ACTIVE  DISCHARGED  DEATHS  ACRATIO  DISRATIO  DERATIO
AN     Andaman And Nicobar               7560       3        7428     129     0.04     98.25     1.71
AP     Andhra Pradesh                 2010566   14853     1981906   13807     0.74     98.57     0.69
AR     Arunachal Pradesh                52831    1063       51508     260     2.01      97.5     0.49
AS     Assam                           588025    7434      574955    5636     1.26     97.78     0.96
BR     Bihar                           725683     112      715918    9653     0.02     98.65     1.33
CH     Chandigarh                       65087      38       64236     813     0.06     98.69     1.25
CG     Chhattisgarh                   1004379     510      990314   13555     0.05      98.6     1.35
DH     Dadra And Nagar Haveli And DD    10663       6       10653       4     0.06     99.91     0.04
DL     Delhi                          1437685     393     1412212   25080     0.03     98.23     1.74
GA     Goa                             173717     952      169572    3193     0.55     97.61     1.84
GJ     Gujarat                         825386     151      815154   10081     0.02     98.76     1.22
HR     Haryana                         770445     657      760115    9673     0.09     98.66     1.26
HP     Himachal Pradesh                213122    1814      207717    3591     0.85     97.46     1.68
JK     Jammu And Kashmir               324979    1211      319362    4406     0.37     98.27     1.36
JH     Jharkhand                       347829     142      342555    5132     0.04     98.48     1.48
KA     Karnataka                      2945993   18923     2889809   37261     0.64     98.09     1.26
KL     Kerala                         3977572  205440     3751666   20466     5.16     94.32     0.51
LK     Ladakh                           20544      72       20265     207     0.35     98.64     1.01
LD     Lakshadweep                      10336      29       10256      51     0.28     99.23     0.49
MP     Madhya Pradesh                  792143      79      781548   10516     0.01     98.66     1.33
MH     Maharashtra                    6452273   55341     6259906  137026     0.86     97.02     2.12
MN     Manipur                         112941    3424      107742    1775     3.03      95.4     1.57
ML     Meghalaya                        75336    2710       71327    1299      3.6     94.68     1.72
MZ     Mizoram                          57522    8342       48968     212     14.5     85.13     0.37
NL     Nagaland                         29920     828       28478     614     2.77     95.18     2.05
OR     Odisha                         1005654    7093      990796    7765     0.71     98.52     0.77
PY     Puducherry                      123298     698      120790    1810     0.57     97.97     1.47
PB     Punjab                          600514     405      583743   16366     0.07     97.21     2.73
RJ     Rajasthan                       954079     116      945009    8954     0.01     99.05     0.94
SK     Sikkim                           29729    1333       28027     369     4.48     94.27     1.24
TN     Tamil Nadu                     2610299   17559     2557884   34856     0.67     97.99     1.34
TL     Telengana                       657119    6065      647185    3869     0.92     98.49     0.59
TR     Tripura                          82775    1028       80950     797     1.24      97.8     0.96
UP     Uttar Pradesh                  1709234     299     1686128   22807     0.02     98.65     1.33
UK     Uttarakhand                     342894     326      335188    7380      0.1     97.75     2.15
WB     West Bengal                    1546898    9109     1519372   18417     0.59     98.22     1.19
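
The report does not state how the three ratio columns were obtained. The values in the table are consistent with each count expressed as a percentage of TOTAL, rounded to two decimals, so the sketch below works under that assumption. It uses the Andaman And Nicobar and Mizoram rows; the function name add_ratios is hypothetical.

```python
import pandas as pd

# Two rows copied from the table: Andaman And Nicobar (AN) and Mizoram (MZ).
counts = pd.DataFrame(
    {
        "TOTAL": [7560, 57522],
        "ACTIVE": [3, 8342],
        "DISCHARGED": [7428, 48968],
        "DEATHS": [129, 212],
    },
    index=["AN", "MZ"],
)


def add_ratios(df):
    """Return a copy with each count as a percentage of TOTAL (assumed definition)."""
    out = df.copy()
    pairs = [("ACTIVE", "ACRATIO"), ("DISCHARGED", "DISRATIO"), ("DEATHS", "DERATIO")]
    for source, target in pairs:
        out[target] = (out[source] / out["TOTAL"] * 100).round(2)
    return out


ratios = add_ratios(counts)
print(ratios[["ACRATIO", "DISRATIO", "DERATIO"]])
```

For these two rows the computed percentages agree with the ACRATIO, DISRATIO and DERATIO values listed in the table, and ACTIVE + DISCHARGED + DEATHS adds up to TOTAL.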
3.2 EXPLORING THE DATA WITH PANDAS

Pandas is an excellent toolkit for working with real world data that often have a
tabular structure (rows and columns). The main data structures in Pandas are
implemented with Series and DataFrame classes. The former is a one-dimensional
indexed array of some fixed data type. The latter is a two-dimensional data structure - a
table - where each column contains data of the same type.

• Pandas DataFrame (a 2-dimensional data structure) is used for storing and
manipulating table-like data (data with rows and columns) in Python. You can
think of a pandas DataFrame as a programmable spreadsheet.
• Pandas Series (a 1-dimensional data structure) is used for storing and
manipulating a sequence of values. A pandas Series is rather like a list, but more
clever. One row or one column in a pandas DataFrame is actually a pandas Series.

Pandas makes it very convenient to load, process, and analyze such tabular data
using SQL-like queries.
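
The relationship between the two structures can be sketched as follows. The three rows are taken from this project's data; the variable names and the threshold of 1000 active cases are illustrative assumptions.

```python
import pandas as pd

# A small DataFrame indexed by state code.
df = pd.DataFrame(
    {
        "STATE": ["GA", "SK", "TR"],
        "ACTIVE": [952, 1333, 1028],
        "DEATHS": [3193, 369, 797],
    }
).set_index("STATE")

active = df["ACTIVE"]  # one column of a DataFrame is a Series
row = df.loc["SK"]     # one row of a DataFrame is also a Series

# SQL-like filter, roughly: SELECT * FROM df WHERE ACTIVE > 1000
busy = df[df["ACTIVE"] > 1000]

print(type(active).__name__, type(row).__name__)
print(list(busy.index))
```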

3.3 PLOTTING SUMMARY STATISTICS

When confronted with a large amount of data, we seek to summarize the data
into statistics that capture the essence of the data with as few numbers as possible.
Graphing the data has a similar goal: to reduce the data to an image that represents all
the key aspects of the raw data.
For this project, we took data from the website of the Government of India. It contains
data on people infected with the COVID-19 virus. The data set has 36 observations that
include the state-wise details of Total cases, Active cases, Discharged cases, Number of
Deaths, Active case ratio, Discharge case ratio and Death ratio, collected over a month.
Various types of plots were used in the project to summarize the statistics.
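
As an illustration of reducing a column to a handful of numbers, the sketch below calls describe() on five death counts taken from this project's data. The choice of these five states is arbitrary and made only for the example.

```python
import pandas as pd

# Death counts for five states and union territories from the project's data.
deaths = pd.Series(
    [3193, 369, 797, 1810, 207],
    index=["GA", "SK", "TR", "PY", "LK"],
    name="DEATHS",
)

# describe() returns count, mean, std, min, the three quartiles and max.
summary = deaths.describe()
print(summary)
```

The same call on a whole DataFrame summarizes every numeric column at once.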
3.4 MENU STRUCTURE

1. For Data Analysis
     1. Display the State wise summary
     2. Display the Highest TOTAL / ACTIVE / DISCHARGED / DEATHS cases
     3. Display the Lowest Confirmed/Recovered/Active/Death cases
     4. Display the Statewise Confirmed/Recovered/Active/Death cases
     5. Display State Wise ACTIVE, DISCHARGE, DEATH RATIOs
     6. Exit
2. For Data Visualisation
     1. Total Cases
     2. Active Cases
     3. Discharge Cases
     4. Death Cases
     5. Exit
3. For Data Manipulation
4. Exit
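
One possible way to drive a menu like this is a table of labels and actions, sketched below. This is not how the project's source code is organised; the function run_menu, its read and write parameters and the scripted answers are all hypothetical and exist only to make the example runnable without a keyboard.

```python
def run_menu(options, read=input, write=print):
    """Show a numbered menu until the entry whose action is None (Exit) is chosen.

    options maps each label to a zero-argument callable, or to None for Exit.
    """
    labels = list(options)
    while True:
        for number, label in enumerate(labels, start=1):
            write(f"{number}. {label}")
        answer = read("Enter your choice : ").strip()
        if not answer.isdecimal() or not 1 <= int(answer) <= len(labels):
            write("Wrong input.........Try again")
            continue
        action = options[labels[int(answer) - 1]]
        if action is None:
            return
        action()


# Demonstration with scripted answers instead of keyboard input.
log = []
menu = {
    "For Data Analysis": lambda: log.append("analysis"),
    "For Data Visualisation": lambda: log.append("visualisation"),
    "Exit": None,
}
answers = iter(["2", "x", "1", "3"])
run_menu(menu, read=lambda prompt: next(answers), write=lambda line: None)
print(log)
```

Keeping the labels and the actions in one place means a new menu entry needs a single new line, and invalid input is handled once for every menu.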
4. SYSTEM IMPLEMENTATION
4.1 SOURCE CODE AND MODULE DESCRIPTION

import pandas as pd
import matplotlib.pyplot as plt
import time
import sys
import os

# Load the state-wise data once at start-up; stop if the file cannot be read.
try:
    covdf = pd.read_csv("C:\\PYTHON38\\IP_PROJECTS\\COVID\\covidin.csv")
except Exception as e:
    print(e)
    sys.exit(1)

# Welcome banner framed by two rows of 38 asterisks.
print('\t\t' + '*' * 38)
print('\t\t* Welcome to COVID Patients Analysis *')
print('\t\t' + '*' * 38 + '\n\n')

# State codes accepted by the state-wise summary.
state = '''
AN Andaman And Nicobar    AP Andhra Pradesh           AR Arunachal Pradesh
AS Assam                  BR Bihar                    CH Chandigarh
CG Chhattisgarh           DH Dadra And Nagar Haveli   DL Delhi
GA Goa                    GJ Gujarat                  HR Haryana
HP Himachal Pradesh       JK Jammu And Kashmir        JH Jharkhand
KA Karnataka              KL Kerala                   LK Ladakh
LD Lakshadweep            MP Madhya Pradesh           MH Maharashtra
MN Manipur                ML Meghalaya                MZ Mizoram
NL Nagaland               OR Odisha                   PY Puducherry
PB Punjab                 RJ Rajasthan                SK Sikkim
TN Tamil Nadu             TL Telengana                TR Tripura
UP Uttar Pradesh          UK Uttarakhand              WB West Bengal
'''

# Count columns paired with the wording used in messages.
counts = [('TOTAL', 'Total'), ('ACTIVE', 'Active'),
          ('DISCHARGED', 'Discharged'), ('DEATHS', 'Death')]

while True:
    print("1. For Data Analysis")
    print("2. For Data Visualisation")
    print("3. For Data Manipulation")
    print("4. Exit")
    option = int(input("Enter your choice : "))

    if option == 1:
        while True:
            print("1. Display the State wise summary")
            print("2. Display the Highest TOTAL / ACTIVE / DISCHARGED / DEATHS cases")
            print("3. Display the Lowest Confirmed/Recovered/Active/Death cases")
            print("4. Display the Statewise Confirmed/Recovered/Active/Death cases")
            print("5. Display State Wise ACTIVE, DISCHARGE, DEATH RATIOs")
            print("6. Exit")
            choice = int(input(" Enter Your choice : "))

            if choice == 1:
                print(state)
                st = input(" Enter the State Code : ").upper()
                print(st)
                # Index a copy by state code so covdf keeps its STATE column
                # and this branch can be chosen more than once.
                bycode = covdf.set_index('STATE')
                if st in bycode.index:
                    print(bycode.loc[st])
                else:
                    print("Wrong State Code")
            elif choice == 2:
                # Row(s) holding the maximum of each count column.
                for col, label in counts:
                    data = covdf[covdf[col] == covdf[col].max()]
                    print("Highest", label, "cases :")
                    print(data[['STATE', col]])
            elif choice == 3:
                # Row(s) holding the minimum of each count column.
                for col, label in counts:
                    data = covdf[covdf[col] == covdf[col].min()]
                    print("Lowest", label, "cases :")
                    print(data[['STATE', col]])
            elif choice == 4:
                # Counts summed over all states and union territories.
                print("Total Cases : ", covdf.TOTAL.sum())
                print("Total Active Cases : ", covdf.ACTIVE.sum())
                print("Total Discharged Cases : ", covdf.DISCHARGED.sum())
                print("Total Death Cases : ", covdf.DEATHS.sum())
            elif choice == 5:
                covdf = pd.read_csv("C:\\PYTHON38\\IP_PROJECTS\\COVID\\covidin.csv")
                # With STATE as the index, columns 5 to 7 are the three ratio columns.
                ratioDF = covdf.set_index('STATE')
                print(ratioDF[ratioDF.columns[5:8]])
            elif choice == 6:
                break

    elif option == 2:
        # Menu number -> (column to plot, wording used in the title and y label).
        charts = {1: ('TOTAL', 'Total'), 2: ('ACTIVE', 'Active'),
                  3: ('DISCHARGED', 'Discharge'), 4: ('DEATHS', 'Death')}
        while True:
            print("\t\t\t 1. Total Cases")
            print("\t\t\t 2. Active Cases")
            print("\t\t\t 3. Discharge Cases")
            print("\t\t\t 4. Death Cases")
            print("\t\t\t 5. Exit")
            choice = int(input("\t\t\t Enter your choice : "))
            covdf = pd.read_csv("C:\\PYTHON38\\IP_PROJECTS\\COVID\\covidin.csv")
            if choice == 5:
                break
            if choice in charts:
                col, label = charts[choice]
                tot = covdf[['STATE', col]].set_index('STATE')
                print(tot)
                tot.plot(kind='bar')
                plt.title('Statewise ' + label + ' Cases')
                plt.xlabel('State')
                plt.ylabel(label + ' Cases')
                plt.show()

    elif option == 3:
        # This branch reads and writes COVID.csv rather than covidin.csv.
        try:
            covdf = pd.read_csv("C:\\PYTHON38\\IP_PROJECTS\\COVID\\COVID.csv")
        except Exception as e:
            print(e)
            continue  # back to the main menu if the file cannot be read
        print(covdf)
        rowidx = int(input("Enter the row index/label to edit the value : "))
        colidx = input("Enter the column index/label to edit the value : ")
        val = covdf.loc[rowidx, colidx]
        ch = input("Do you really want to update the column's value " + str(val) + " [Y/N] : ")
        # Update only on an explicit yes.
        if ch.strip().upper() == 'Y':
            newval = int(input("Enter the new value : "))
            covdf.loc[rowidx, colidx] = newval
            print("Data frame updated successfully............")
            print(covdf)
            covdf.to_csv("C:\\PYTHON38\\IP_PROJECTS\\COVID\\COVID.csv", index=False)
            print("CSV file updated successfully...........")

    elif option == 4:
        os.system('cls')  # works fine in DOS environment
        print("Thank you for using COVID ANALYSIS SYSTEM")
        print("System exiting.............")
        time.sleep(5)
        break

    else:
        print("Wrong input.........Try again")
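
The highest and lowest entries of the analysis menu filter the table by comparing a column with its maximum or minimum. As an alternative sketch, not the method used in the listing, idxmax() and idxmin() give the row label of the extreme value directly. The helper name extremes and the three sample rows are assumptions for illustration.

```python
import pandas as pd

# Three rows from the project's data.
df = pd.DataFrame(
    {
        "STATE": ["GA", "SK", "TR"],
        "TOTAL": [173717, 29729, 82775],
        "DEATHS": [3193, 369, 797],
    }
)


def extremes(frame, column, label="STATE"):
    """Return ((label, value) at the maximum, (label, value) at the minimum)."""
    highest = frame.loc[frame[column].idxmax()]
    lowest = frame.loc[frame[column].idxmin()]
    return (highest[label], int(highest[column])), (lowest[label], int(lowest[column]))


print(extremes(df, "TOTAL"))
print(extremes(df, "DEATHS"))
```

Note that idxmax() and idxmin() report only the first row when several rows tie, whereas the comparison used in the listing returns every tied row.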


4.2 SCREEN SHOTS
5. CONCLUSION
Since pandas is a large library with many different specialist features and functions, the
project described here focuses mainly on the fundamentals of manipulating data (indexing,
grouping, aggregating, etc.), making use of the core DataFrame and Series objects. The
software is easy to use since it is menu driven with user-friendly screens. No formal
programming knowledge is required of the user. Also, the user is not burdened with data
storing and data retrieval procedures, as both are done internally. Visualisation also
helps to effectively communicate information to intended users.


6. REFERENCES
1) Informatics Practices, by Sumita Arora
2) Python: The Complete Reference, by Martin C Brown
3) Programming & Problem Solving Through Python, by Sathish Jain & Shashi Singh
4) Python for Beginners, by Prof. Rahul E. Borate
5) Data Analysis and Visualization with Pandas, by Purna Chander Rao
