Assignment1_param - converted

PRACTICAL PACKET CAPTURING USING Wireshark TEMPLATE

Uploaded by

paramdholakia3

AIML 202046702

ASSIGNMENT-1
importing required libraries
In [1]: import pandas as pd
import matplotlib.pyplot as plt

reading the dataset


In [2]: df=pd.read_csv('amazon.csv')

1. Display Top 5 Rows of The Dataset.


In [3]: df.head(5)

Out[3]:
      year  state      month     number  date
0     1998  Acre       Janeiro   0.0     1998-01-01
1     1999  Acre       Janeiro   0.0     1999-01-01
2     2000  Acre       Janeiro   0.0     2000-01-01
3     2001  Acre       Janeiro   0.0     2001-01-01
4     2002  Acre       Janeiro   0.0     2002-01-01

2. Check Last 5 Rows.


In [4]: df.tail(5)

Out[4]:
      year  state      month     number  date
6449  2012  Tocantins  Dezembro  128.0   2012-01-01
6450  2013  Tocantins  Dezembro  85.0    2013-01-01
6451  2014  Tocantins  Dezembro  223.0   2014-01-01
6452  2015  Tocantins  Dezembro  373.0   2015-01-01
6453  2016  Tocantins  Dezembro  119.0   2016-01-01

3. Find Shape of Our Dataset (Number of Rows and Number of Columns).
In [5]: print('No. of rows: ',df.shape[0])
print('No. of columns: ',df.shape[1])

12202040501049 PARAM H DHOLAKIA



No. of rows: 6454
No. of columns: 5

4. Get Information About Our Dataset: Total Number of Rows, Total Number of Columns, Datatype of Each Column and Memory Requirement.
In [6]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6454 entries, 0 to 6453
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 year 6454 non-null int64
1 state 6454 non-null object
2 month 6454 non-null object
3 number 6454 non-null float64
4 date 6454 non-null object
dtypes: float64(1), int64(1), object(3)
memory usage: 252.2+ KB

5. Check For Duplicate Data and Drop Them.


In [7]: df.columns

Out[7]: Index(['year', 'state', 'month', 'number', 'date'], dtype='object')

In [8]: duplicate=df[df.duplicated()]
duplicate


Out[8]:
      year  state        month      number  date
259   2017  Alagoas      Janeiro    38.0    2017-01-01
2630  1998  Mato Grosso  Janeiro    0.0     1998-01-01
2650  1998  Mato Grosso  Fevereiro  0.0     1998-01-01
2670  1998  Mato Grosso  Março      0.0     1998-01-01
2690  1998  Mato Grosso  Abril      0.0     1998-01-01
2710  1998  Mato Grosso  Maio       0.0     1998-01-01
3586  1998  Paraiba      Janeiro    0.0     1998-01-01
3606  1998  Paraiba      Fevereiro  0.0     1998-01-01
3621  2013  Paraiba      Fevereiro  9.0     2013-01-01
3626  1998  Paraiba      Março      0.0     1998-01-01
3646  1998  Paraiba      Abril      0.0     1998-01-01
3666  1998  Paraiba      Maio       0.0     1998-01-01
4542  1998  Rio          Janeiro    0.0     1998-01-01
4562  1998  Rio          Fevereiro  0.0     1998-01-01
4582  1998  Rio          Março      0.0     1998-01-01
4585  2001  Rio          Março      0.0     2001-01-01
4590  2006  Rio          Março      8.0     2006-01-01
4602  1998  Rio          Abril      0.0     1998-01-01
4608  2004  Rio          Abril      3.0     2004-01-01
4613  2009  Rio          Abril      1.0     2009-01-01
4622  1998  Rio          Maio       0.0     1998-01-01
4631  2007  Rio          Maio       2.0     2007-01-01
4632  2008  Rio          Maio       0.0     2008-01-01
4645  2001  Rio          Junho      13.0    2001-01-01
4781  1998  Rio          Janeiro    0.0     1998-01-01
4800  2017  Rio          Janeiro    28.0    2017-01-01
4801  1998  Rio          Fevereiro  0.0     1998-01-01
4821  1998  Rio          Março      0.0     1998-01-01
4841  1998  Rio          Abril      0.0     1998-01-01
4861  1998  Rio          Maio       0.0     1998-01-01
4864  2001  Rio          Maio       4.0     2001-01-01
4910  2007  Rio          Julho      7.0     2007-01-01


In [9]: df=df.drop_duplicates()

In [10]: df

Out[10]:
      year  state      month     number  date
0     1998  Acre       Janeiro   0.0     1998-01-01
1     1999  Acre       Janeiro   0.0     1999-01-01
2     2000  Acre       Janeiro   0.0     2000-01-01
3     2001  Acre       Janeiro   0.0     2001-01-01
4     2002  Acre       Janeiro   0.0     2002-01-01
...   ...   ...        ...       ...     ...
6449  2012  Tocantins  Dezembro  128.0   2012-01-01
6450  2013  Tocantins  Dezembro  85.0    2013-01-01
6451  2014  Tocantins  Dezembro  223.0   2014-01-01
6452  2015  Tocantins  Dezembro  373.0   2015-01-01
6453  2016  Tocantins  Dezembro  119.0   2016-01-01

6422 rows × 5 columns
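The row count falls from 6454 to 6422, which matches the 32 rows listed in Out[8]. The following is a minimal sketch of that relationship on a toy frame with hypothetical values (not taken from amazon.csv): by default `duplicated()` flags only the second and later copies of a row, so the number of flagged rows equals the number of rows that `drop_duplicates()` removes.

```python
import pandas as pd

# Toy frame (hypothetical values) containing one repeated row.
toy = pd.DataFrame({
    "year": [1998, 1998, 1999],
    "state": ["Rio", "Rio", "Rio"],
    "number": [0.0, 0.0, 5.0],
})

# duplicated() marks only the later copies by default (keep="first").
n_dupes = int(toy.duplicated().sum())

# drop_duplicates() keeps the first copy; reset_index gives a gap-free index.
deduped = toy.drop_duplicates().reset_index(drop=True)

# Rows removed should equal rows flagged.
print(n_dupes, len(toy) - len(deduped))
```

Resetting the index is optional; the notebook keeps the original labels, which is why Out[10] still ends at index 6453.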

6. Check Null Values in The Dataset.


In [11]: #checks for total no.of null values for each column
df.isna().sum()

Out[11]: year 0
state 0
month 0
number 0
date 0
dtype: int64

7. Get Overall Statistics About the Dataframe.


In [12]: df.describe()


Out[12]:
       year         number
count  6422.000000  6422.000000
mean   2007.490969  108.815178
std    5.731806     191.142482
min    1998.000000  0.000000
25%    2003.000000  3.000000
50%    2007.000000  24.497000
75%    2012.000000  114.000000
max    2017.000000  998.000000

8. Rename Month Names to English.


In [13]: df['month'].unique()

Out[13]: array(['Janeiro', 'Fevereiro', 'Março', 'Abril', 'Maio', 'Junho', 'Julho',
       'Agosto', 'Setembro', 'Outubro', 'Novembro', 'Dezembro'],
      dtype=object)

In [14]: month_map={'Janeiro':'January','Fevereiro':'February','Março':'March','Abril':'April',
           'Maio':'May','Junho':'June','Julho':'July',
           'Agosto':'August', 'Setembro':'September', 'Outubro':'October', 'Novembro':'November',
           'Dezembro':'December'}

In [15]: df['month']=df['month'].map(month_map)
df['month'].unique()

Out[15]: array(['January', 'February', 'March', 'April', 'May', 'June', 'July',
       'August', 'September', 'October', 'November', 'December'],
      dtype=object)
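`Series.map` with a dict turns any label that is missing from the dict into NaN, so a misspelled or absent key would silently blank out that month. A small check like the one below can catch this. It is a sketch under stated assumptions: `partial_map` is a hypothetical, deliberately incomplete mapping used only to show the failure mode.

```python
import pandas as pd

# Hypothetical partial mapping, deliberately missing 'Março'.
partial_map = {"Janeiro": "January", "Fevereiro": "February"}
months = pd.Series(["Janeiro", "Fevereiro", "Março"])

mapped = months.map(partial_map)

# Labels absent from the dict become NaN; list the original offenders.
unmapped = sorted(months[mapped.isna()].unique())
print(unmapped)
```

In the notebook, Out[15] shows twelve English names and no NaN, which indicates every Portuguese label was covered.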

9. Total Number of Fires Registered.


In [16]: # note: df.shape[0] counts rows (records); the fire counts themselves are in 'number'
print('Total fires registered: ',df.shape[0])

Total fires registered: 6422
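`df.shape[0]` is the number of rows, and each row is one state/month/year record whose `number` column holds the fires reported. If the question is read as asking for fires rather than records, summing `number` is one way to answer it. This is a minimal sketch on hypothetical toy values. The last line estimates the real total from the mean and count shown in Out[12], so it is only as precise as those rounded figures.

```python
import pandas as pd

# Toy stand-in for the dataset (hypothetical values).
toy = pd.DataFrame({
    "state": ["Acre", "Acre", "Rio"],
    "number": [0.0, 10.0, 2.5],
})

n_records = toy.shape[0]         # number of rows (records)
n_fires = toy["number"].sum()    # total of the reported fire counts

print("Records:", n_records)
print("Total fires:", n_fires)

# Rough estimate for the real data: mean * count from Out[12].
estimate = 108.815178 * 6422
print("Estimated total from describe():", round(estimate))
```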

10. In Which Month Was the Maximum Number of Forest Fires Reported?
In [17]: df.columns

Out[17]: Index(['year', 'state', 'month', 'number', 'date'], dtype='object')

In [18]: no_of_cases=df.groupby('month')['number'].sum().sort_values(ascending=False).index
print(no_of_cases[0],' is the month with highest no. of cases')

July is the month with highest no. of cases


11. In Which Year Was the Maximum Number of Forest Fires Reported?
In [19]: no_of_cases=df.groupby('year')['number'].sum().sort_values(ascending=False).index
print(no_of_cases[0],' is the year with highest no. of cases')

2003 is the year with highest no. of cases

12. In Which State Was the Maximum Number of Forest Fires Reported?
In [20]: no_of_cases=df.groupby('state')['number'].sum().sort_values(ascending=False).index
print(no_of_cases[0],' is the state with highest no. of cases')

Mato Grosso is the state with highest no. of cases
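Questions 10 to 12 share one pattern: group by a column, sum `number`, and take the label with the largest total. A possible alternative to sorting and taking the first index entry is `Series.idxmax()`, which returns that label directly. The helper name `top_group` and the toy values are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Toy data (hypothetical values).
toy = pd.DataFrame({
    "month": ["July", "July", "May", "May"],
    "state": ["Acre", "Rio", "Acre", "Rio"],
    "number": [30.0, 5.0, 10.0, 1.0],
})

def top_group(frame, key):
    """Return (label, total) for the group with the largest summed 'number'."""
    totals = frame.groupby(key)["number"].sum()
    label = totals.idxmax()   # label of the maximum, no sort needed
    return label, totals.loc[label]

print(top_group(toy, "month"))
print(top_group(toy, "state"))
```

Returning the total alongside the label also makes it easy to report how large the maximum is, not only where it occurs.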

13. Find the Total Number of Fires Reported in Amazonas.
In [21]: df.columns

Out[21]: Index(['year', 'state', 'month', 'number', 'date'], dtype='object')

In [22]: #extract rows with state Amazonas
df2=df[df['state']=='Amazonas']

In [23]: print("Total number of forest fires in Amazonas:",df2['number'].sum()) # sum of the 'number' column

Total number of forest fires in Amazonas: 30650.129

14. Display the Number of Fires Reported in Amazonas (Year-Wise).
In [24]: df.columns

Out[24]: Index(['year', 'state', 'month', 'number', 'date'], dtype='object')

In [25]: df3=df[df['state']=='Amazonas'].groupby('year')['number'].sum()
df3


Out[25]: year
1998 946.000
1999 1061.000
2000 853.000
2001 1297.000
2002 2852.000
2003 1524.268
2004 2298.207
2005 1657.128
2006 997.640
2007 589.601
2008 2717.000
2009 1320.601
2010 2324.508
2011 1652.538
2012 1110.641
2013 905.217
2014 2385.909
2015 1189.994
2016 2060.972
2017 906.905
Name: number, dtype: float64
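As a quick consistency check, the year-wise figures should add up to the Amazonas total of 30650.129 printed in Out[23]. The sketch below copies the values from Out[25] into a plain dict (the name `yearly` is only illustrative) and sums them, rounding to three decimals to absorb floating-point noise.

```python
# Year-wise Amazonas totals copied from Out[25].
yearly = {
    1998: 946.000, 1999: 1061.000, 2000: 853.000, 2001: 1297.000,
    2002: 2852.000, 2003: 1524.268, 2004: 2298.207, 2005: 1657.128,
    2006: 997.640, 2007: 589.601, 2008: 2717.000, 2009: 1320.601,
    2010: 2324.508, 2011: 1652.538, 2012: 1110.641, 2013: 905.217,
    2014: 2385.909, 2015: 1189.994, 2016: 2060.972, 2017: 906.905,
}

# Sum all years and compare with the overall total from Out[23].
total = round(sum(yearly.values()), 3)
print(total)

# Year with the largest total, for reference.
peak_year = max(yearly, key=yearly.get)
print(peak_year, yearly[peak_year])
```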

15. Display the Number of Fires Reported in Amazonas (Day-Wise).
In [26]: #extract rows with state amazonas
df2=df[df['state']=='Amazonas']

In [27]: #convert date column to date-time format


df2['date'] = pd.to_datetime(df2['date'])
df3=df2.groupby(df2['date'].dt.dayofweek)['number'].sum()

C:\Users\PARAM\AppData\Local\Temp\ipykernel_8680\3119725923.py:2: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df2['date'] = pd.to_datetime(df2['date'])
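The warning appears because `df2` was produced by filtering `df`, and pandas cannot tell whether the assignment is meant to affect the original frame. One common way to avoid it is to take an explicit copy of the filtered rows before adding or converting columns. The sketch below uses a toy frame with hypothetical values.

```python
import pandas as pd

# Toy stand-in for the dataset (hypothetical values).
toy = pd.DataFrame({
    "state": ["Amazonas", "Acre", "Amazonas"],
    "number": [10.0, 3.0, 5.0],
    "date": ["2001-01-01", "2001-01-01", "2002-01-01"],
})

# .copy() makes the subset an independent DataFrame, so assigning a
# column to it is unambiguous and does not trigger the warning.
subset = toy[toy["state"] == "Amazonas"].copy()
subset["date"] = pd.to_datetime(subset["date"])

print(subset.dtypes["date"])
print(type(toy["date"].iloc[0]))  # the original frame still holds strings
```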

In [28]: # pandas numbers weekdays Monday=0 ... Sunday=6 in Series.dt.dayofweek
day_names = {0: 'Monday', 1: 'Tuesday', 2: 'Wednesday', 3: 'Thursday', 4: 'Friday', 5: 'Saturday', 6: 'Sunday'}

In [29]: #map numeric day to names of day
df3.index = df3.index.map(day_names)

In [30]: df3

Out[30]: date
Monday 1886.601
Tuesday 6474.217
Wednesday 3910.177
Thursday 5754.802
Friday 5446.480
Saturday 4162.666
Sunday 3015.186
Name: number, dtype: float64
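pandas numbers weekdays from Monday=0 to Sunday=6 in `dt.dayofweek`, and `dt.day_name()` returns the names directly, which removes the need for a hand-written lookup. The sketch below uses three year-wise Amazonas totals from Out[25], each dated 1 January of its year as in the rows displayed earlier. In those displayed rows the `date` column always holds 1 January, so a day-wise grouping appears to reflect the weekday of New Year's Day in each year rather than the day a fire occurred.

```python
import pandas as pd

# Three year-wise Amazonas totals from Out[25], dated 1 January of each year.
sample = pd.DataFrame({
    "date": pd.to_datetime(["2001-01-01", "2007-01-01", "2006-01-01"]),
    "number": [1297.000, 589.601, 997.640],
})

# dayofweek: Monday=0 ... Sunday=6
print(sample["date"].dt.dayofweek.tolist())

# day_name() yields 'Monday', 'Tuesday', ... without a manual mapping.
by_day = sample.groupby(sample["date"].dt.day_name())["number"].sum()
print(by_day)
```

Both 1 January 2001 and 1 January 2007 fell on a Monday, so their totals are combined under that label.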

16. Find the Total Number of Fires Reported in 2015 and Visualize the Data by ‘Month’.
In [31]: #total fire reports in each month for 2015
df2=df[df['year']==2015].groupby('month')['number'].sum().reset_index()

In [32]: df2

Out[32]:
    month      number
0   April      2573.000
1   August     4363.125
2   December   4088.522
3   February   2309.000
4   January    4635.000
5   July       4364.392
6   June       3260.552
7   March      2202.000
8   May        2384.000
9   November   4034.518
10  October    4499.525
11  September  2494.658

In [33]: plt.figure(figsize=(20, 5)) #to ensure image readability


plt.bar(df2['month'],df2['number'])
plt.show()
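Because `groupby` sorts its keys, the months in Out[32] are in alphabetical order, and the bar chart inherits that order. One way to plot them chronologically is to turn `month` into an ordered categorical before sorting. The sketch below uses three of the 2015 totals from Out[32]. The names `month_order` and `monthly` are illustrative.

```python
import pandas as pd

# Calendar order for the English month names.
month_order = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# Three of the 2015 monthly totals from Out[32], in alphabetical order.
monthly = pd.DataFrame({
    "month": ["April", "February", "January"],
    "number": [2573.000, 2309.000, 4635.000],
})

# An ordered categorical makes sort_values follow calendar order.
monthly["month"] = pd.Categorical(monthly["month"],
                                  categories=month_order, ordered=True)
monthly = monthly.sort_values("month").reset_index(drop=True)
print(monthly)
```

Passing `monthly['month'].astype(str)` and `monthly['number']` to `plt.bar` would then draw the bars from January onward.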


17. Find the Average Number of Fires Reported, from Highest to Lowest (State-Wise).
In [34]: #Group the data by state and find average reports state-wise
df2=df.groupby('state')['number'].mean().reset_index()

In [35]: #sort values from highest to lowest average


df2.sort_values('number',ascending=False)

Out[35]:
    state             number
20  Sao Paulo         213.896226
10  Mato Grosso       203.479975
4   Bahia             187.222703
15  Piau              158.174674
8   Goias             157.721841
11  Minas Gerais      156.800243
22  Tocantins         141.037176
3   Amazonas          128.243218
5   Ceara             127.314071
12  Paraiba           111.073979
9   Maranhao          105.142808
13  Pará              102.561272
14  Pernambuco        102.502092
18  Roraima           102.029598
19  Santa Catarina    101.924067
2   Amapa             91.345506
17  Rondonia          84.876272
0   Acre              77.255356
16  Rio               64.698515
7   Espirito Santo    27.389121
1   Alagoas           19.271967
6   Distrito Federal  14.899582
21  Sergipe           13.543933

18. Find the State Names Where Fires Were Reported in December ('dec').


In [36]: states=df[df['month']=='December']['state'].unique()

In [37]: print("List of states:")
for i in states:
    print(i)

List of states:
Acre
Alagoas
Amapa
Amazonas
Bahia
Ceara
Distrito Federal
Espirito Santo
Goias
Maranhao
Mato Grosso
Minas Gerais
Pará
Paraiba
Pernambuco
Piau
Rio
Rondonia
Roraima
Santa Catarina
Sao Paulo
Sergipe
Tocantins
