
DSBDA Mini Project - Ipynb - Colab

The document outlines a mini project by Nikhil Singh on analyzing COVID-19 vaccination data in India, utilizing libraries such as NumPy, Pandas, Seaborn, and Matplotlib. It includes data reading, displaying the top and bottom rows of the dataset, and provides descriptive statistics of various vaccination metrics. The dataset consists of 7845 rows and 24 columns, detailing doses administered, sessions, and demographics of vaccinated individuals.

DSBDA Mini Project

March 31, 2025

NAME: Nikhil Singh


ROLL NO.: TECO2425B002
Mini Project

[2]: # Importing the required libraries


import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

[3]: # Reading the csv file


data = pd.read_csv("covid_vaccine_statewise.csv")

[4]: # Top five rows


print("The top five rows are: ")
data.head()

The top five rows are:

[4]: Updated On State Total Doses Administered Sessions Sites \


0 16/01/2021 India 48276.0 3455.0 2957.0
1 17/01/2021 India 58604.0 8532.0 4954.0
2 18/01/2021 India 99449.0 13611.0 6583.0
3 19/01/2021 India 195525.0 17855.0 7951.0
4 20/01/2021 India 251280.0 25472.0 10504.0

First Dose Administered Second Dose Administered \


0 48276.0 0.0
1 58604.0 0.0
2 99449.0 0.0
3 195525.0 0.0
4 251280.0 0.0

Male (Doses Administered) Female (Doses Administered) \


0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN

4 NaN NaN

Transgender (Doses Administered) ... 18-44 Years (Doses Administered) \


0 NaN ... NaN
1 NaN ... NaN
2 NaN ... NaN
3 NaN ... NaN
4 NaN ... NaN

45-60 Years (Doses Administered) 60+ Years (Doses Administered) \


0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN

18-44 Years(Individuals Vaccinated) 45-60 Years(Individuals Vaccinated) \


0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN

60+ Years(Individuals Vaccinated) Male(Individuals Vaccinated) \


0 NaN 23757.0
1 NaN 27348.0
2 NaN 41361.0
3 NaN 81901.0
4 NaN 98111.0

Female(Individuals Vaccinated) Transgender(Individuals Vaccinated) \


0 24517.0 2.0
1 31252.0 4.0
2 58083.0 5.0
3 113613.0 11.0
4 153145.0 24.0

Total Individuals Vaccinated


0 48276.0
1 58604.0
2 99449.0
3 195525.0
4 251280.0

[5 rows x 24 columns]

[5]: # Last five rows
print("The last five rows are: ")
data.tail()

The last five rows are:

[5]: Updated On State Total Doses Administered Sessions Sites \


7840 11/08/2021 West Bengal NaN NaN NaN
7841 12/08/2021 West Bengal NaN NaN NaN
7842 13/08/2021 West Bengal NaN NaN NaN
7843 14/08/2021 West Bengal NaN NaN NaN
7844 15/08/2021 West Bengal NaN NaN NaN

First Dose Administered Second Dose Administered \


7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

Male (Doses Administered) Female (Doses Administered) \


7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

Transgender (Doses Administered) ... 18-44 Years (Doses Administered) \


7840 NaN ... NaN
7841 NaN ... NaN
7842 NaN ... NaN
7843 NaN ... NaN
7844 NaN ... NaN

45-60 Years (Doses Administered) 60+ Years (Doses Administered) \


7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

18-44 Years(Individuals Vaccinated) \


7840 NaN
7841 NaN
7842 NaN
7843 NaN
7844 NaN

45-60 Years(Individuals Vaccinated) 60+ Years(Individuals Vaccinated) \
7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

Male(Individuals Vaccinated) Female(Individuals Vaccinated) \


7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

Transgender(Individuals Vaccinated) Total Individuals Vaccinated


7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

[5 rows x 24 columns]

[6]: # Shape of the dataset in the format of (rows, columns)


print("The shape is: ")
data.shape

The shape is:

[6]: (7845, 24)

[7]: # Names of columns


print("The columns present in the dataset are: ")
data.columns

The columns present in the dataset are:

[7]: Index(['Updated On', 'State', 'Total Doses Administered', 'Sessions',
       ' Sites ', 'First Dose Administered', 'Second Dose Administered',
       'Male (Doses Administered)', 'Female (Doses Administered)',
       'Transgender (Doses Administered)', ' Covaxin (Doses Administered)',
       'CoviShield (Doses Administered)', 'Sputnik V (Doses Administered)',
       'AEFI', '18-44 Years (Doses Administered)',
       '45-60 Years (Doses Administered)', '60+ Years (Doses Administered)',
       '18-44 Years(Individuals Vaccinated)',
       '45-60 Years(Individuals Vaccinated)',
       '60+ Years(Individuals Vaccinated)', 'Male(Individuals Vaccinated)',
       'Female(Individuals Vaccinated)', 'Transgender(Individuals Vaccinated)',
       'Total Individuals Vaccinated'],
      dtype='object')
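
Two labels in the listing above, ' Sites ' and ' Covaxin (Doses Administered)', carry stray spaces, so a lookup such as data['Sites'] would raise a KeyError. A minimal sketch of stripping them, using a toy frame with made-up values rather than the real file:

```python
import pandas as pd

# Toy frame reusing two padded labels from the listing; the numbers are made up.
toy = pd.DataFrame({
    " Sites ": [10.0, 20.0],
    " Covaxin (Doses Administered)": [1.0, 2.0],
})

# Strip leading and trailing whitespace from every column label.
toy.columns = toy.columns.str.strip()
clean_columns = list(toy.columns)
print(clean_columns)
```

The same one-line assignment could be applied to data right after reading the CSV, assuming no later cell relies on the padded names.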

0.0.1 Describe the dataset


To describe the dataset, we use the describe() function. Its output includes the count, mean,
standard deviation, minimum, quartiles and maximum of each numeric column.

[8]: data.describe()

[8]: Total Doses Administered Sessions Sites \


count 7.621000e+03 7.621000e+03 7621.000000
mean 9.188171e+06 4.792358e+05 2282.872064
std 3.746180e+07 1.911511e+06 7275.973730
min 7.000000e+00 0.000000e+00 0.000000
25% 1.356570e+05 6.004000e+03 69.000000
50% 8.182020e+05 4.547000e+04 597.000000
75% 6.625243e+06 3.428690e+05 1708.000000
max 5.132284e+08 3.501031e+07 73933.000000

First Dose Administered Second Dose Administered \


count 7.621000e+03 7.621000e+03
mean 7.414415e+06 1.773755e+06
std 2.995209e+07 7.570382e+06
min 7.000000e+00 0.000000e+00
25% 1.166320e+05 1.283100e+04
50% 6.614590e+05 1.388180e+05
75% 5.387805e+06 1.166434e+06
max 4.001504e+08 1.130780e+08

Male (Doses Administered) Female (Doses Administered) \


count 7.461000e+03 7.461000e+03
mean 3.620156e+06 3.168416e+06
std 1.737938e+07 1.515310e+07
min 0.000000e+00 2.000000e+00
25% 5.655500e+04 5.210700e+04
50% 3.897850e+05 3.342380e+05
75% 2.735777e+06 2.561513e+06
max 2.701636e+08 2.395186e+08

Transgender (Doses Administered) Covaxin (Doses Administered) \


count 7461.000000 7.621000e+03
mean 1162.978019 1.044669e+06
std 5931.353995 4.452259e+06
min 0.000000 0.000000e+00
25% 8.000000 0.000000e+00
50% 113.000000 1.185100e+04
75% 800.000000 7.579300e+05

max 98275.000000 6.236742e+07

CoviShield (Doses Administered) ... 18-44 Years (Doses Administered) \


count 7.621000e+03 ... 1.702000e+03
mean 8.126553e+06 ... 8.773958e+06
std 3.298414e+07 ... 2.660829e+07
min 7.000000e+00 ... 2.662400e+04
25% 1.331340e+05 ... 4.344842e+05
50% 7.567360e+05 ... 3.095970e+06
75% 6.007817e+06 ... 7.366241e+06
max 4.468251e+08 ... 2.243304e+08

45-60 Years (Doses Administered) 60+ Years (Doses Administered) \


count 1.702000e+03 1.702000e+03
mean 7.442161e+06 5.641605e+06
std 2.225999e+07 1.681650e+07
min 1.681500e+04 9.994000e+03
25% 2.326275e+05 1.285605e+05
50% 2.695938e+06 1.805696e+06
75% 6.969726e+06 5.294763e+06
max 1.667575e+08 1.186927e+08

18-44 Years(Individuals Vaccinated) \


count 3.733000e+03
mean 1.395895e+06
std 5.501454e+06
min 1.059000e+03
25% 5.655400e+04
50% 2.947270e+05
75% 9.105160e+05
max 9.224315e+07

45-60 Years(Individuals Vaccinated) 60+ Years(Individuals Vaccinated) \


count 3.734000e+03 3.734000e+03
mean 2.916515e+06 2.627444e+06
std 9.567607e+06 8.192225e+06
min 1.136000e+03 5.580000e+02
25% 9.248225e+04 5.615975e+04
50% 8.330395e+05 7.887425e+05
75% 2.499280e+06 2.337874e+06
max 9.096888e+07 6.731098e+07

Male(Individuals Vaccinated) Female(Individuals Vaccinated) \


count 1.600000e+02 1.600000e+02
mean 4.461687e+07 3.951018e+07
std 3.950749e+07 3.417684e+07
min 2.375700e+04 2.451700e+04

25% 5.739350e+06 5.023407e+06
50% 3.716590e+07 3.365402e+07
75% 7.441663e+07 6.685368e+07
max 1.349420e+08 1.156684e+08

Transgender(Individuals Vaccinated) Total Individuals Vaccinated


count 160.000000 5.919000e+03
mean 12370.543750 4.547842e+06
std 12485.026753 1.834182e+07
min 2.000000 7.000000e+00
25% 1278.750000 7.427550e+04
50% 8007.500000 4.022880e+05
75% 19851.000000 3.501562e+06
max 46462.000000 2.506569e+08

[8 rows x 22 columns]
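
The count row differs between columns (7621 for Total Doses Administered, 160 for Male(Individuals Vaccinated)) because describe() skips missing values. A small sketch on made-up numbers shows the effect:

```python
import numpy as np
import pandas as pd

# Four rows, one of them missing; the values are illustrative only.
toy = pd.DataFrame({"doses": [10.0, 20.0, np.nan, 40.0]})

# describe() computes every statistic over the non-null entries only.
summary = toy["doses"].describe()
print(summary)
```

Here count is 3 rather than 4, and the mean is taken over the three present values.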

[9]: data.describe(include='object')

[9]: Updated On State


count 7845 7845
unique 213 37
top 16/01/2021 Delhi
freq 37 213

[10]: # Information about the dataset


data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7845 entries, 0 to 7844
Data columns (total 24 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Updated On 7845 non-null object
1 State 7845 non-null object
2 Total Doses Administered 7621 non-null float64
3 Sessions 7621 non-null float64
4 Sites 7621 non-null float64
5 First Dose Administered 7621 non-null float64
6 Second Dose Administered 7621 non-null float64
7 Male (Doses Administered) 7461 non-null float64
8 Female (Doses Administered) 7461 non-null float64
9 Transgender (Doses Administered) 7461 non-null float64
10 Covaxin (Doses Administered) 7621 non-null float64
11 CoviShield (Doses Administered) 7621 non-null float64
12 Sputnik V (Doses Administered) 2995 non-null float64
13 AEFI 5438 non-null float64
14 18-44 Years (Doses Administered) 1702 non-null float64

15 45-60 Years (Doses Administered) 1702 non-null float64
16 60+ Years (Doses Administered) 1702 non-null float64
17 18-44 Years(Individuals Vaccinated) 3733 non-null float64
18 45-60 Years(Individuals Vaccinated) 3734 non-null float64
19 60+ Years(Individuals Vaccinated) 3734 non-null float64
20 Male(Individuals Vaccinated) 160 non-null float64
21 Female(Individuals Vaccinated) 160 non-null float64
22 Transgender(Individuals Vaccinated) 160 non-null float64
23 Total Individuals Vaccinated 5919 non-null float64
dtypes: float64(22), object(2)
memory usage: 1.4+ MB

[11]: data.isnull().sum()

[11]: Updated On 0
State 0
Total Doses Administered 224
Sessions 224
Sites 224
First Dose Administered 224
Second Dose Administered 224
Male (Doses Administered) 384
Female (Doses Administered) 384
Transgender (Doses Administered) 384
Covaxin (Doses Administered) 224
CoviShield (Doses Administered) 224
Sputnik V (Doses Administered) 4850
AEFI 2407
18-44 Years (Doses Administered) 6143
45-60 Years (Doses Administered) 6143
60+ Years (Doses Administered) 6143
18-44 Years(Individuals Vaccinated) 4112
45-60 Years(Individuals Vaccinated) 4111
60+ Years(Individuals Vaccinated) 4111
Male(Individuals Vaccinated) 7685
Female(Individuals Vaccinated) 7685
Transgender(Individuals Vaccinated) 7685
Total Individuals Vaccinated 1926
dtype: int64
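
Raw null counts are easier to compare as a share of all rows. The sketch below, on a toy frame with made-up values, ranks columns by the percentage of missing entries:

```python
import numpy as np
import pandas as pd

# Toy frame; column names follow the dataset, values are made up.
toy = pd.DataFrame({
    "State": ["A", "A", "B", "B"],
    "First Dose Administered": [100.0, np.nan, 50.0, 80.0],
    "AEFI": [np.nan, np.nan, np.nan, 3.0],
})

# isnull() gives booleans; their column mean is the fraction of missing rows.
missing_pct = toy.isnull().mean().mul(100).sort_values(ascending=False)
print(missing_pct)
```

Applied to data, this would put the three gender-wise "Individuals Vaccinated" columns (7685 nulls out of 7845 rows) at the top.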

The dataset contains many null values. A common remedy is to replace them with the
mean (for numerical data) or the mode (for categorical data). Here we work on
“First Dose Administered” and “Second Dose Administered”. Both are float columns, so we
replace their NaN values with the column mean (average).
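
The replacement described above can be sketched on a toy frame with made-up values. The second variant is not from the notebook: it assumes the dose columns are running totals per state, in which case carrying the last known value forward within each state may distort the data less than a global mean.

```python
import numpy as np
import pandas as pd

# Toy frame; values are made up for illustration.
toy = pd.DataFrame({
    "State": ["A", "A", "A", "B", "B", "B"],
    "First Dose Administered": [10.0, np.nan, 30.0, 100.0, 200.0, np.nan],
})
col = "First Dose Administered"

# Mean imputation: fill every gap with the column average (85.0 here).
mean_filled = toy[col].fillna(toy[col].mean())

# Alternative (assumption: cumulative data): forward-fill within each state.
ffilled = toy.groupby("State")[col].ffill()

print(mean_filled.tolist())
print(ffilled.tolist())
```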

For First Dose Administered


[12]: # Average of First Dose Administered
avg_firstdose = data["First Dose Administered"].astype("float").mean(axis = 0)

print("Average of First Dose:", avg_firstdose)

Average of First Dose: 7414415.300354284

[13]: # Replacing First Dose Administered
# Assign the result back to the column: fillna(inplace=True) on data[col] is
# chained assignment, which pandas warns about and which stops working in pandas 3.0.

data["First Dose Administered"] = data["First Dose Administered"].fillna(avg_firstdose)
data

[13]: Updated On State Total Doses Administered Sessions Sites \


0 16/01/2021 India 48276.0 3455.0 2957.0
1 17/01/2021 India 58604.0 8532.0 4954.0
2 18/01/2021 India 99449.0 13611.0 6583.0
3 19/01/2021 India 195525.0 17855.0 7951.0
4 20/01/2021 India 251280.0 25472.0 10504.0
... ... ... ... ... ...
7840 11/08/2021 West Bengal NaN NaN NaN
7841 12/08/2021 West Bengal NaN NaN NaN
7842 13/08/2021 West Bengal NaN NaN NaN
7843 14/08/2021 West Bengal NaN NaN NaN
7844 15/08/2021 West Bengal NaN NaN NaN

First Dose Administered Second Dose Administered \


0 4.827600e+04 0.0
1 5.860400e+04 0.0
2 9.944900e+04 0.0
3 1.955250e+05 0.0
4 2.512800e+05 0.0
... ... ...
7840 7.414415e+06 NaN
7841 7.414415e+06 NaN
7842 7.414415e+06 NaN
7843 7.414415e+06 NaN
7844 7.414415e+06 NaN

Male (Doses Administered) Female (Doses Administered) \
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
... ... ...
7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

Transgender (Doses Administered) ... 18-44 Years (Doses Administered) \


0 NaN ... NaN
1 NaN ... NaN
2 NaN ... NaN
3 NaN ... NaN
4 NaN ... NaN
... ... ... ...
7840 NaN ... NaN
7841 NaN ... NaN
7842 NaN ... NaN
7843 NaN ... NaN
7844 NaN ... NaN

45-60 Years (Doses Administered) 60+ Years (Doses Administered) \


0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
... ... ...
7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

18-44 Years(Individuals Vaccinated) \


0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
... ...
7840 NaN

7841 NaN
7842 NaN
7843 NaN
7844 NaN

45-60 Years(Individuals Vaccinated) 60+ Years(Individuals Vaccinated) \


0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
... ... ...
7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

Male(Individuals Vaccinated) Female(Individuals Vaccinated) \


0 23757.0 24517.0
1 27348.0 31252.0
2 41361.0 58083.0
3 81901.0 113613.0
4 98111.0 153145.0
... ... ...
7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

Transgender(Individuals Vaccinated) Total Individuals Vaccinated


0 2.0 48276.0
1 4.0 58604.0
2 5.0 99449.0
3 11.0 195525.0
4 24.0 251280.0
... ... ...
7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

[7845 rows x 24 columns]

For Second Dose Administered

[14]: # Average of Second Dose Administered
avg_seconddose = data["Second Dose Administered"].astype("float").mean(axis = 0)
print("Average of Second Dose:", avg_seconddose)

Average of Second Dose: 1773755.2436688098

[15]: # Replacing Second Dose Administered
# Assign the result back to the column: fillna(inplace=True) on data[col] is
# chained assignment, which pandas warns about and which stops working in pandas 3.0.

data["Second Dose Administered"] = data["Second Dose Administered"].fillna(avg_seconddose)
data

[15]: Updated On State Total Doses Administered Sessions Sites \


0 16/01/2021 India 48276.0 3455.0 2957.0
1 17/01/2021 India 58604.0 8532.0 4954.0
2 18/01/2021 India 99449.0 13611.0 6583.0
3 19/01/2021 India 195525.0 17855.0 7951.0
4 20/01/2021 India 251280.0 25472.0 10504.0
... ... ... ... ... ...
7840 11/08/2021 West Bengal NaN NaN NaN
7841 12/08/2021 West Bengal NaN NaN NaN
7842 13/08/2021 West Bengal NaN NaN NaN
7843 14/08/2021 West Bengal NaN NaN NaN
7844 15/08/2021 West Bengal NaN NaN NaN

First Dose Administered Second Dose Administered \


0 4.827600e+04 0.000000e+00
1 5.860400e+04 0.000000e+00
2 9.944900e+04 0.000000e+00
3 1.955250e+05 0.000000e+00
4 2.512800e+05 0.000000e+00
... ... ...
7840 7.414415e+06 1.773755e+06
7841 7.414415e+06 1.773755e+06
7842 7.414415e+06 1.773755e+06

7843 7.414415e+06 1.773755e+06
7844 7.414415e+06 1.773755e+06

Male (Doses Administered) Female (Doses Administered) \


0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
... ... ...
7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

Transgender (Doses Administered) ... 18-44 Years (Doses Administered) \


0 NaN ... NaN
1 NaN ... NaN
2 NaN ... NaN
3 NaN ... NaN
4 NaN ... NaN
... ... ... ...
7840 NaN ... NaN
7841 NaN ... NaN
7842 NaN ... NaN
7843 NaN ... NaN
7844 NaN ... NaN

45-60 Years (Doses Administered) 60+ Years (Doses Administered) \


0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
... ... ...
7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

18-44 Years(Individuals Vaccinated) \


0 NaN
1 NaN
2 NaN
3 NaN

4 NaN
... ...
7840 NaN
7841 NaN
7842 NaN
7843 NaN
7844 NaN

45-60 Years(Individuals Vaccinated) 60+ Years(Individuals Vaccinated) \


0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
... ... ...
7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

Male(Individuals Vaccinated) Female(Individuals Vaccinated) \


0 23757.0 24517.0
1 27348.0 31252.0
2 41361.0 58083.0
3 81901.0 113613.0
4 98111.0 153145.0
... ... ...
7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

Transgender(Individuals Vaccinated) Total Individuals Vaccinated


0 2.0 48276.0
1 4.0 58604.0
2 5.0 99449.0
3 11.0 195525.0
4 24.0 251280.0
... ... ...
7840 NaN NaN
7841 NaN NaN
7842 NaN NaN
7843 NaN NaN
7844 NaN NaN

[7845 rows x 24 columns]

The data is now ready to be used for the questions that follow.

0.0.2 Number of persons state wise vaccinated for first dose in India

[16]: first_dose = data.groupby('State')[['First Dose Administered']].sum()


first_dose

[16]: First Dose Administered


State
Andaman and Nicobar Islands 6.091235e+07
Andhra Pradesh 1.277347e+09
Arunachal Pradesh 9.349147e+07
Assam 6.300867e+08
Bihar 1.514989e+09
Chandigarh 8.918960e+07
Chhattisgarh 8.404894e+08
Dadra and Nagar Haveli and Daman and Diu 8.549597e+07
Delhi 6.762404e+08
Goa 1.204779e+08
Gujarat 2.176133e+09
Haryana 8.002848e+08
Himachal Pradesh 3.607805e+08
India 2.830663e+10
Jammu and Kashmir 4.545883e+08
Jharkhand 6.481602e+08
Karnataka 1.917816e+09
Kerala 1.238332e+09
Ladakh 6.229574e+07
Lakshadweep 4.885015e+07
Madhya Pradesh 1.841091e+09
Maharashtra 2.828851e+09
Manipur 1.118961e+08
Meghalaya 1.071025e+08
Mizoram 9.235957e+07
Nagaland 8.689726e+07
Odisha 1.077120e+09
Puducherry 8.583335e+07
Punjab 6.288331e+08
Rajasthan 2.245531e+09
Sikkim 8.146742e+07
Tamil Nadu 1.333019e+09
Telangana 9.248071e+08
Tripura 2.371762e+08
Uttar Pradesh 2.832898e+09
Uttarakhand 4.076779e+08
West Bengal 1.840936e+09
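
One caveat on the sums above: the column appears to hold running totals. The India rows in the head() output rise from 48276 to 251280 over five days, and describe() reports a maximum of about 4.0e8 while the summed India figure is 2.8e10. If that reading is right, summing over dates counts each dose many times. A hedged sketch on made-up numbers, taking the largest value per state instead and dropping the national 'India' row:

```python
import numpy as np
import pandas as pd

# Toy cumulative series for one aggregate row and one state; values are made up.
toy = pd.DataFrame({
    "State": ["India", "India", "India", "Goa", "Goa", "Goa"],
    "First Dose Administered": [100.0, 250.0, 400.0, 5.0, 12.0, np.nan],
})

# 'India' is a national aggregate, not a state, so exclude it from state tables.
states_only = toy[toy["State"] != "India"]

# Summing a running total overcounts; its maximum is the latest cumulative value.
summed = states_only.groupby("State")["First Dose Administered"].sum()
latest = states_only.groupby("State")["First Dose Administered"].max()

print(summed)
print(latest)
```

This would need to run before the mean fill, since an imputed mean of about 7.4 million can exceed the true maximum of a small state.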

0.0.3 Number of persons state wise vaccinated for second dose in India

[17]: second_dose = data.groupby('State')[['Second Dose Administered']].sum()


second_dose

[17]: Second Dose Administered


State
Andaman and Nicobar Islands 1.476109e+07
Andhra Pradesh 3.694601e+08
Arunachal Pradesh 2.257485e+07
Assam 1.414313e+08
Bihar 2.814331e+08
Chandigarh 2.223627e+07
Chhattisgarh 1.827629e+08
Dadra and Nagar Haveli and Daman and Diu 1.701070e+07
Delhi 2.006352e+08
Goa 2.684071e+07
Gujarat 6.110609e+08
Haryana 1.692986e+08
Himachal Pradesh 8.448111e+07
India 6.770264e+09
Jammu and Kashmir 9.659418e+07
Jharkhand 1.327636e+08
Karnataka 4.378297e+08
Kerala 3.746913e+08
Ladakh 1.609629e+07
Lakshadweep 1.169898e+07
Madhya Pradesh 3.275755e+08
Maharashtra 7.235236e+08
Manipur 2.250068e+07
Meghalaya 2.280916e+07
Mizoram 2.064095e+07
Nagaland 1.984717e+07
Odisha 2.619453e+08
Puducherry 1.925139e+07
Punjab 1.317635e+08
Rajasthan 5.023455e+08
Sikkim 2.036617e+07
Tamil Nadu 3.013132e+08
Telangana 2.087955e+08
Tripura 7.591267e+07
Uttar Pradesh 5.650776e+08
Uttarakhand 1.107276e+08
West Bengal 5.967894e+08

0.0.4 Number of Males vaccinated
[18]: male = data["Male(Individuals Vaccinated)"].sum()
print("The total number of male individuals vaccinated are", int(male))

The total number of male individuals vaccinated are 7138698858

0.0.5 Number of females vaccinated


[19]: female = data["Female(Individuals Vaccinated)"].sum()
print("The total number of female individuals vaccinated are", int(female))

The total number of female individuals vaccinated are 6321628736
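
Both printed totals are in the billions, far above the column maxima reported by describe() (about 1.35e8 for males and 1.16e8 for females), which suggests the sum is adding up a running total. A sketch, assuming the column is cumulative, that reads the last recorded value instead; the first five numbers are the male counts from the head() output and the trailing NaN stands in for the many empty rows:

```python
import numpy as np
import pandas as pd

male = pd.Series(
    [23757.0, 27348.0, 41361.0, 81901.0, 98111.0, np.nan],
    name="Male(Individuals Vaccinated)",
)

# Sum of a running total: every earlier day is counted again.
summed_male = male.sum()

# Last non-null entry: the cumulative count as of the latest report.
latest_male = male.dropna().iloc[-1]

print(int(summed_male), int(latest_male))
```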

[20]: # Convert the 'Updated On' column to datetime (adjust the format if necessary)
data["Updated On"] = pd.to_datetime(data["Updated On"], format="%d/%m/%Y",
                                    errors='coerce')

[21]: # Convert 'Updated On' column to datetime format


data['Updated On'] = pd.to_datetime(data['Updated On'])

# Group by date and sum doses administered


daily_vaccine_data = data.groupby('Updated On')['Total Doses Administered'].sum()

# Plot the line graph


plt.figure(figsize=(12, 6))
sns.lineplot(x=daily_vaccine_data.index, y=daily_vaccine_data.values)
plt.xticks(rotation=45)
plt.xlabel("Date")
plt.ylabel("Total Doses Administered")
plt.title("Total Vaccine Doses Administered Over Time")
plt.show()
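
Grouping by date and summing adds the national 'India' row to the state rows for the same day, so the plotted line would be roughly double the national total if the states add up to India. A sketch of an alternative, assuming the 'India' rows are the national running total: keep only those rows and difference them to get doses per day. The three values are the first India rows from the head() output.

```python
import pandas as pd

toy = pd.DataFrame({
    "Updated On": pd.to_datetime(
        ["16/01/2021", "17/01/2021", "18/01/2021"], format="%d/%m/%Y"
    ),
    "State": ["India", "India", "India"],
    "Total Doses Administered": [48276.0, 58604.0, 99449.0],
})

# National running total, indexed by date.
national = (
    toy[toy["State"] == "India"]
    .set_index("Updated On")["Total Doses Administered"]
)

# Day-over-day difference gives doses administered on each day.
daily_new = national.diff()
print(daily_new)
```

Either series can be passed to sns.lineplot in place of daily_vaccine_data.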

[22]: # Group by state and sum total doses
state_vaccine_data = (
    data.groupby('State')['Total Doses Administered'].sum()
    .sort_values(ascending=False)
)

# Plot the bar chart
top_states = state_vaccine_data.head(15)

plt.figure(figsize=(12, 6))
# Assign x to hue so the palette applies without the seaborn deprecation warning
sns.barplot(x=top_states.index, y=top_states.values, hue=top_states.index,
            palette="coolwarm", legend=False)

plt.xticks(rotation=90)
plt.xlabel("State")
plt.ylabel("Total Doses Administered")
plt.title("Top 15 States by Vaccine Doses Administered")
plt.show()

[23]: # Group by state and sum doses
dose_comparison = data.groupby('State')[['First Dose Administered',
                                         'Second Dose Administered']].sum()

# Plot stacked bar chart


dose_comparison.head(15).plot(kind='bar', stacked=True, figsize=(12, 6),
                              colormap='viridis')

plt.xlabel("State")
plt.ylabel("Doses Administered")
plt.title("First vs Second Dose Administered by State")
plt.xticks(rotation=90)
plt.legend(["First Dose", "Second Dose"])
plt.show()
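
Because groupby sorts its keys, head(15) here selects the first 15 states alphabetically, not the 15 with the most doses. If a ranking is wanted, the rows can be ordered by combined doses first. A sketch on a toy frame with made-up numbers:

```python
import pandas as pd

# Toy per-state totals indexed by state; numbers are made up.
toy = pd.DataFrame(
    {
        "First Dose Administered": [5.0, 90.0, 40.0],
        "Second Dose Administered": [1.0, 30.0, 20.0],
    },
    index=pd.Index(["Assam", "Uttar Pradesh", "Kerala"], name="State"),
)

# Rank states by first plus second doses, largest first, then keep the top two.
order = toy.sum(axis=1).sort_values(ascending=False).index
top2 = toy.loc[order].head(2)
print(top2)
```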

[24]: # Sum gender-wise vaccination
gender_vaccine_data = data[['Male (Doses Administered)',
                            'Female (Doses Administered)',
                            'Transgender (Doses Administered)']].sum()

# Plot pie chart


plt.figure(figsize=(8, 8))
plt.pie(gender_vaccine_data, labels=gender_vaccine_data.index,
        autopct='%1.1f%%', colors=['lightblue', 'pink', 'purple'])

plt.title("Gender-wise Distribution of Vaccine Doses")


plt.show()

[25]: # Sum up vaccination by age group
age_group_vaccine_data = data[['18-44 Years (Doses Administered)',
                               '45-60 Years (Doses Administered)',
                               '60+ Years (Doses Administered)']].sum()

# Plot bar chart


plt.figure(figsize=(8, 6))
# Assign x to hue so the palette applies without the seaborn deprecation warning
sns.barplot(x=age_group_vaccine_data.index, y=age_group_vaccine_data.values,
            hue=age_group_vaccine_data.index, palette="magma", legend=False)

plt.xlabel("Age Group")
plt.ylabel("Doses Administered")
plt.title("Vaccination by Age Group")
plt.show()
