Payal - 2 - Practical (1) - Edited

The document contains a practical assignment by Arati Dilip Kadam, focusing on data analysis using Python libraries such as NumPy, Pandas, Matplotlib, and Seaborn. It involves reading a CSV file related to academic performance, checking for missing values, and performing data cleaning operations like dropping null values. The output includes various data manipulations and the resulting cleaned datasets.

Name: Arati Dilip Kadam

Roll No: COTA29

Practical 2

In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]: sp = pd.read_csv("/home/student/Desktop/Dataset/AcademicPerformanceNEW.csv")

In [3]: sp.isnull()

Out[3]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
 0   False       False          False          False            False           False            False     False
 1   False       False          False          False            False           False            False     False
 2   False       False          False          False            False           False            False     False
 3   False       False          False          False            False           False            False     False
 4   False       False          False          False            False           False            False     False
 5    True       False           True          False            False            True            False     False
 6   False       False          False          False            False           False            False     False
 7   False       False          False          False            False           False            False     False
 8   False       False          False           True            False           False            False     False
 9   False       False          False          False            False           False            False     False
10   False       False          False          False            False           False            False      True
11    True       False          False          False            False           False            False     False
12   False       False          False          False            False            True            False     False
13   False       False          False          False            False           False            False     False
14   False       False          False          False            False           False            False     False
15   False       False          False          False            False           False            False     False
16   False       False          False          False            False           False            False     False
17   False       False           True          False            False           False            False     False
18    True       False          False          False            False           False            False     False
19   False       False          False          False            False           False            False      True
20   False       False          False          False            False           False            False     False
21   False       False          False          False            False           False            False     False
22   False       False          False          False            False            True            False     False
23   False       False          False          False            False           False            False     False
24   False       False          False          False            False           False            False     False
25   False       False          False          False            False           False            False     False
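A full True/False grid is hard to scan, so a per-column count is often easier to read. The sketch below is illustrative only: the frame `sample` is an assumption, built by hand from five rows of this dataset (rows 4, 5, 6, 8 and 10) so that it runs without the CSV file.

```python
import numpy as np
import pandas as pd

# Five rows copied by hand from the dataset; `sample` is an illustrative name.
sample = pd.DataFrame(
    {
        "gender": ["Female", np.nan, "Female", "Female", "Female"],
        "math score": [62, 69, 77, 75, 61],
        "reading score": [65.0, np.nan, 100.0, 69.0, 69.0],
        "writing score": [68.0, 75.0, 72.0, np.nan, 64.0],
        "placement score": [87, 50, 93, 77, 80],
        "club join year": [2018.0, np.nan, 2020.0, 2019.0, 2019.0],
        "placement count": [3, 1, 3, 2, 2],
        "region": ["Mulshi", "Baramati", "Karad", "Karad", np.nan],
    },
    index=[4, 5, 6, 8, 10],
)

# Number of missing values in each column.
missing_per_column = sample.isnull().sum()

# Number of rows that contain at least one missing value.
rows_with_missing = int(sample.isnull().any(axis=1).sum())

print(missing_per_column)
print("rows with a missing value:", rows_with_missing)
```

The same two expressions should work on the full frame `sp` once it has been loaded.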


In [6]: series = pd.isnull(sp["math score"])
sp[series]

Out[6]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
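The result is empty because no row has a missing math score. As a small sketch of the same boolean-mask filter on a column that does contain a gap, the example below uses a hand-built frame (`sample` is an assumed name) holding rows 4, 5 and 6 of the dataset, restricted to two columns.

```python
import numpy as np
import pandas as pd

# Rows 4, 5 and 6 of the dataset, two columns only; `sample` is illustrative.
sample = pd.DataFrame(
    {"gender": ["Female", np.nan, "Female"], "math score": [62, 69, 77]},
    index=[4, 5, 6],
)

# Every math score is present, so this selection is empty.
no_math = sample[sample["math score"].isnull()]

# Row 5 has no gender recorded, so it is the only row kept.
no_gender = sample[sample["gender"].isnull()]

print(no_math)
print(no_gender)
```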

In [7]: sp.notnull()

Out[7]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
 0    True        True           True           True             True            True             True      True
 1    True        True           True           True             True            True             True      True
 2    True        True           True           True             True            True             True      True
 3    True        True           True           True             True            True             True      True
 4    True        True           True           True             True            True             True      True
 5   False        True          False           True             True           False             True      True
 6    True        True           True           True             True            True             True      True
 7    True        True           True           True             True            True             True      True
 8    True        True           True          False             True            True             True      True
 9    True        True           True           True             True            True             True      True
10    True        True           True           True             True            True             True     False
11   False        True           True           True             True            True             True      True
12    True        True           True           True             True           False             True      True
13    True        True           True           True             True            True             True      True
14    True        True           True           True             True            True             True      True
15    True        True           True           True             True            True             True      True
16    True        True           True           True             True            True             True      True
17    True        True          False           True             True            True             True      True
18   False        True           True           True             True            True             True      True
19    True        True           True           True             True            True             True     False
20    True        True           True           True             True            True             True      True
21    True        True           True           True             True            True             True      True
22    True        True           True           True             True           False             True      True
23    True        True           True           True             True            True             True      True
24    True        True           True           True             True            True             True      True
25    True        True           True           True             True            True             True      True

26    True        True           True           True             True            True             True      True
27    True        True           True           True             True            True             True      True
28    True        True           True           True             True            True             True      True


In [8]: sp.dropna()

Out[8]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
 0    Male          66           65.0           76.0               97          2020.0                3      Pune
 1  Female          91           70.0           76.0               76          2019.0                2  Baramati
 2  Female          72           66.0           75.0               79          2019.0                2    Satara
 3    Male          99           75.0           67.0               84          2018.0                2     Karad
 4  Female          62           65.0           68.0               87          2018.0                3    Mulshi
 6  Female          77          100.0           72.0               93          2020.0                3     Karad
 7    Male          62           80.0           74.0               96          2021.0                3      Pune
 9  Female          79           64.0          101.0               85          2021.0                3  Baramati
13  Female          66           68.0           75.0               98          2020.0                3  Baramati
14    Male          80           78.0           75.0               95          2018.0                3      Pune
15  Female          72           68.0           72.0               84          2021.0                2    Mulshi
16    Male          80           66.0           73.0               79          2019.0                2  Baramati
20    Male          67           60.0           64.0               91          2021.0                3      Pune
21  Female          69           71.0           63.0               77          2019.0                2    Mulshi
23  Female          74           62.0           69.0              100          2021.0                3      Pune
24    Male          77           61.0           74.0               84          2021.0                2    Mulshi
25  Female          75           73.0           74.0               86          2021.0                3      Pune
26    Male          65           64.0           64.0               96          2021.0                3    Mulshi
27    Male          64           68.0           65.0               95          2020.0                3    Mulshi
28    Male          69           68.0           80.0               75          2019.0                2      Pune


In [9]: sp.dropna(how='all')

Out[9]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
 0    Male          66           65.0           76.0               97          2020.0                3      Pune
 1  Female          91           70.0           76.0               76          2019.0                2  Baramati
 2  Female          72           66.0           75.0               79          2019.0                2    Satara
 3    Male          99           75.0           67.0               84          2018.0                2     Karad
 4  Female          62           65.0           68.0               87          2018.0                3    Mulshi
 5     NaN          69            NaN           75.0               50             NaN                1  Baramati
 6  Female          77          100.0           72.0               93          2020.0                3     Karad
 7    Male          62           80.0           74.0               96          2021.0                3      Pune
 8  Female          75           69.0            NaN               77          2019.0                2     Karad
 9  Female          79           64.0          101.0               85          2021.0                3  Baramati
10  Female          61           69.0           64.0               80          2019.0                2       NaN
11     NaN          61           61.0           69.0               80          2020.0                2      Pune
12  Female          78           73.0           73.0               90             NaN                3     Karad
13  Female          66           68.0           75.0               98          2020.0                3  Baramati
14    Male          80           78.0           75.0               95          2018.0                3      Pune
15  Female          72           68.0           72.0               84          2021.0                2    Mulshi
16    Male          80           66.0           73.0               79          2019.0                2  Baramati
17    Male          60            NaN           62.0               85          2020.0                3      Pune
18     NaN          80           77.0           68.0               89          2018.0                3    Mulshi
19  Female          72           72.0           78.0               76          2018.0                2       NaN
20    Male          67           60.0           64.0               91          2021.0                3      Pune
21  Female          69           71.0           63.0               77          2019.0                2    Mulshi
22    Male          64           69.0           63.0               75             NaN                2  Baramati
23  Female          74           62.0           69.0              100          2021.0                3      Pune
24    Male          77           61.0           74.0               84          2021.0                2    Mulshi
25  Female          75           73.0           74.0               86          2021.0                3      Pune
26    Male          65           64.0           64.0               96          2021.0                3    Mulshi
27    Male          64           68.0           65.0               95          2020.0                3    Mulshi
28    Male          69           68.0           80.0               75          2019.0                2      Pune


In [10]: sp.dropna(axis=1)

Out[10]:
    math score  placement score  placement count
 0          66               97                3
 1          91               76                2
 2          72               79                2
 3          99               84                2
 4          62               87                3
 5          69               50                1
 6          77               93                3
 7          62               96                3
 8          75               77                2
 9          79               85                3
10          61               80                2
11          61               80                2
12          78               90                3
13          66               98                3
14          80               95                3
15          72               84                2
16          80               79                2
17          60               85                3
18          80               89                3
19          72               76                2
20          67               91                3
21          69               77                2
22          64               75                2
23          74              100                3
24          77               84                2
25          75               86                3
26          65               96                3
27          64               95                3
28          69               75                2
In [11]: new_data = sp.dropna(axis=0, how='any')
new_data

Out[11]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
 0    Male          66           65.0           76.0               97          2020.0                3      Pune
 1  Female          91           70.0           76.0               76          2019.0                2  Baramati
 2  Female          72           66.0           75.0               79          2019.0                2    Satara
 3    Male          99           75.0           67.0               84          2018.0                2     Karad
 4  Female          62           65.0           68.0               87          2018.0                3    Mulshi
 6  Female          77          100.0           72.0               93          2020.0                3     Karad
 7    Male          62           80.0           74.0               96          2021.0                3      Pune
 9  Female          79           64.0          101.0               85          2021.0                3  Baramati
13  Female          66           68.0           75.0               98          2020.0                3  Baramati
14    Male          80           78.0           75.0               95          2018.0                3      Pune
15  Female          72           68.0           72.0               84          2021.0                2    Mulshi
16    Male          80           66.0           73.0               79          2019.0                2  Baramati
20    Male          67           60.0           64.0               91          2021.0                3      Pune
21  Female          69           71.0           63.0               77          2019.0                2    Mulshi
23  Female          74           62.0           69.0              100          2021.0                3      Pune
24    Male          77           61.0           74.0               84          2021.0                2    Mulshi
25  Female          75           73.0           74.0               86          2021.0                3      Pune
26    Male          65           64.0           64.0               96          2021.0                3    Mulshi
27    Male          64           68.0           65.0               95          2020.0                3    Mulshi
28    Male          69           68.0           80.0               75          2019.0                2      Pune
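The four `dropna` calls above differ only in their arguments, which makes them easy to mix up. The sketch below compares them side by side on a small hand-built frame. `sample` is an assumed name, and its values are rows 4, 5, 6, 8 and 10 of the dataset restricted to four columns. The `subset` argument is not used in this practical and is included only for comparison.

```python
import numpy as np
import pandas as pd

# Rows 4, 5, 6, 8 and 10 of the dataset, four columns only; `sample` is illustrative.
sample = pd.DataFrame(
    {
        "gender": ["Female", np.nan, "Female", "Female", "Female"],
        "math score": [62, 69, 77, 75, 61],
        "writing score": [68.0, 75.0, 72.0, np.nan, 64.0],
        "region": ["Mulshi", "Baramati", "Karad", "Karad", np.nan],
    },
    index=[4, 5, 6, 8, 10],
)

any_missing = sample.dropna()                    # default: axis=0, how="any"
all_missing = sample.dropna(how="all")           # drops only rows that are entirely NaN
complete_cols = sample.dropna(axis=1)            # drops columns that contain a NaN
needs_region = sample.dropna(subset=["region"])  # checks the named column only

# dropna returns a new frame each time; `sample` itself is left unchanged.
print(any_missing.index.tolist())
print(all_missing.index.tolist())
print(complete_cols.columns.tolist())
print(needs_region.index.tolist())
```

Because no row of the dataset is entirely empty, `how='all'` removes nothing, which is why Out[9] still lists all 29 rows. `dropna()` and `dropna(axis=0, how='any')` are the same call, so Out[8] and Out[11] are identical.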

In [12]: print(np.where(sp['math score'] > 90))

(array([1, 3]),)

In [13]: print(np.where(sp['reading score'] < 25))

(array([], dtype=int64),)
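`np.where` with a single condition returns positions, not index labels. The two coincide for `sp` because its index runs 0, 1, 2, ... but they diverge once rows have been dropped. The sketch below illustrates this with the first five math scores from the tables above; the variable names are assumptions.

```python
import numpy as np
import pandas as pd

# First five math scores of the dataset; the Series name is illustrative.
math = pd.Series([66, 91, 72, 99, 62], name="math score")

# Positions of the scores above 90 (same result as In [12]).
positions = np.where(math > 90)[0]

# After removing the first row, positions shift but index labels do not.
trimmed = math.iloc[1:]
trimmed_positions = np.where(trimmed > 90)[0]
labels = trimmed.index[trimmed_positions]

# A boolean mask keeps labels and values together, which is usually easier to read.
high = math[math > 90]

print(positions.tolist(), trimmed_positions.tolist(), labels.tolist())
print(high)
```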
In [14]: col = ['math score', 'reading score', 'placement score']
sp.boxplot(column=col)

Out[14]: <Axes: >
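The box plot image is not preserved here, but the points it would draw as outliers can be computed directly. The sketch below applies the usual 1.5 × IQR rule, which is also Matplotlib's default whisker length, to the two complete score columns. The values are copied from Out[10]; the helper `iqr_outliers` and the frame name `scores` are assumptions, not part of the original notebook.

```python
import pandas as pd

# Values copied from the dropna(axis=1) output; `scores` is an illustrative name.
scores = pd.DataFrame({
    "math score": [66, 91, 72, 99, 62, 69, 77, 62, 75, 79, 61, 61, 78, 66, 80,
                   72, 80, 60, 80, 72, 67, 69, 64, 74, 77, 75, 65, 64, 69],
    "placement score": [97, 76, 79, 84, 87, 50, 93, 96, 77, 85, 80, 80, 90, 98, 95,
                        84, 79, 85, 89, 76, 91, 77, 75, 100, 84, 86, 96, 95, 75],
})


def iqr_outliers(column: pd.Series) -> pd.Series:
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, q3 = column.quantile(0.25), column.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return column[(column < lower) | (column > upper)]


for name in scores.columns:
    print(name, iqr_outliers(scores[name]).to_dict())
```

On these values the rule flags the math score of 99 in row 3 and the placement score of 50 in row 5. The math score of 91 in row 1 exceeds the fixed threshold used in In [12] but falls inside the upper fence of 95.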


In [18]: fig, axes = plt.subplots(figsize=(18, 10))
axes.scatter(sp['placement score'], sp['placement count'])
plt.show()
In [19]: from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
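The notebook stops after creating the encoder. As a hedged sketch of how it might be used next, the example below encodes the gender values of rows 0 to 4; choosing the gender column is an assumption, since the original does not say which column was to be encoded.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Gender values of rows 0-4 as shown in the tables above.
gender = pd.Series(["Male", "Female", "Female", "Male", "Female"], name="gender")

le = LabelEncoder()

# fit_transform learns the sorted class list and maps each value to its position in it.
encoded = le.fit_transform(gender)

print(list(le.classes_))   # sorted classes: Female comes before Male
print(encoded.tolist())    # Female -> 0, Male -> 1
```

The gender column of `sp` contains missing values, so it would probably be safer to run the encoder on `new_data` (the frame left after `dropna`) or to fill the gaps first.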
