DSBDA02

The document contains a practical assignment by Laksh Desai, focusing on data analysis using a dataset related to academic performance. It includes code snippets for reading a CSV file, checking for null values, and dropping rows or columns with missing data. The analysis aims to clean and prepare the dataset for further examination.

Uploaded by

Laksh Desai

Name : Laksh Desai

Roll_no : COTB67
Practical 2

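In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:

# Load the academic performance dataset into a DataFrame
sp = pd.read_csv("/home/student/Desktop/Dataset/AcademicPerformanceNEW.csv")

In [3]:

# Boolean mask: True wherever a value is missing
sp.isnull()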
Out[3]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
0    False       False          False          False            False           False            False     False
1    False       False          False          False            False           False            False     False
2    False       False          False          False            False           False            False     False
3    False       False          False          False            False           False            False     False
4    False       False          False          False            False           False            False     False
5     True       False           True          False            False            True            False     False
6    False       False          False          False            False           False            False     False
7    False       False          False          False            False           False            False     False
8    False       False          False           True            False           False            False     False
9    False       False          False          False            False           False            False     False
10   False       False          False          False            False           False            False      True
11    True       False          False          False            False           False            False     False
12   False       False          False          False            False            True            False     False
13   False       False          False          False            False           False            False     False
14   False       False          False          False            False           False            False     False
15   False       False          False          False            False           False            False     False
16   False       False          False          False            False           False            False     False
17   False       False           True          False            False           False            False     False
18    True       False          False          False            False           False            False     False
19   False       False          False          False            False           False            False      True
20   False       False          False          False            False           False            False     False
21   False       False          False          False            False           False            False     False
22   False       False          False          False            False            True            False     False
23   False       False          False          False            False           False            False     False
24   False       False          False          False            False           False            False     False
25   False       False          False          False            False           False            False     False
26   False       False          False          False            False           False            False     False
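The Boolean frame above is hard to scan by eye. The following is a minimal sketch of summarising it, assuming a small hand-built sample in place of the full frame: rows 4, 5, 8 and 10 of the dataset as printed in Out[9]. The names `sample`, `null_counts` and `rows_with_nulls` are illustrative and do not appear in the notebook.

```python
import numpy as np
import pandas as pd

# Rows 4, 5, 8 and 10 of the dataset, copied from the Out[9] listing.
# `sample` is an illustrative stand-in for the full frame `sp`.
sample = pd.DataFrame(
    {
        "gender": ["Female", np.nan, "Female", "Female"],
        "math score": [62, 69, 75, 61],
        "reading score": [65.0, np.nan, 69.0, 69.0],
        "writing score": [68.0, 75.0, np.nan, 64.0],
        "placement score": [87, 50, 77, 80],
        "club join year": [2018.0, np.nan, 2019.0, 2019.0],
        "placement count": [3, 1, 2, 2],
        "region": ["Mulshi", "Baramati", "Karad", np.nan],
    },
    index=[4, 5, 8, 10],
)

# True counts as 1, so summing the mask gives the missing values per column.
null_counts = sample.isnull().sum()
print(null_counts)

# Boolean Series marking the rows that contain at least one missing value.
rows_with_nulls = sample.isnull().any(axis=1)
print(sample.index[rows_with_nulls].tolist())
```

On the full frame the same two expressions would be `sp.isnull().sum()` and `sp.isnull().any(axis=1)`.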

In [6]:

series = pd.isnull(sp["math score"])
sp[series]

Out[6]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
In [7]:

sp.notnull()

Out[7]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
0     True        True           True           True             True            True             True      True
1     True        True           True           True             True            True             True      True
2     True        True           True           True             True            True             True      True
3     True        True           True           True             True            True             True      True
4     True        True           True           True             True            True             True      True
5    False        True          False           True             True           False             True      True
6     True        True           True           True             True            True             True      True
7     True        True           True           True             True            True             True      True
8     True        True           True          False             True            True             True      True
9     True        True           True           True             True            True             True      True
10    True        True           True           True             True            True             True     False
11   False        True           True           True             True            True             True      True
12    True        True           True           True             True           False             True      True
13    True        True           True           True             True            True             True      True
14    True        True           True           True             True            True             True      True
15    True        True           True           True             True            True             True      True
16    True        True           True           True             True            True             True      True
17    True        True          False           True             True            True             True      True
18   False        True           True           True             True            True             True      True
19    True        True           True           True             True            True             True     False
20    True        True           True           True             True            True             True      True
21    True        True           True           True             True            True             True      True
22    True        True           True           True             True           False             True      True
23    True        True           True           True             True            True             True      True
24    True        True           True           True             True            True             True      True
25    True        True           True           True             True            True             True      True
26    True        True           True           True             True            True             True      True
27    True        True           True           True             True            True             True      True
28    True        True           True           True             True            True             True      True

In [8]:

# Drop every row that contains at least one missing value
sp.dropna()

Out[8]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
0     Male          66           65.0           76.0               97          2020.0                3      Pune
1   Female          91           70.0           76.0               76          2019.0                2  Baramati
2   Female          72           66.0           75.0               79          2019.0                2    Satara
3     Male          99           75.0           67.0               84          2018.0                2     Karad
4   Female          62           65.0           68.0               87          2018.0                3    Mulshi
6   Female          77          100.0           72.0               93          2020.0                3     Karad
7     Male          62           80.0           74.0               96          2021.0                3      Pune
9   Female          79           64.0          101.0               85          2021.0                3  Baramati
13  Female          66           68.0           75.0               98          2020.0                3  Baramati
14    Male          80           78.0           75.0               95          2018.0                3      Pune
15  Female          72           68.0           72.0               84          2021.0                2    Mulshi
16    Male          80           66.0           73.0               79          2019.0                2  Baramati
20    Male          67           60.0           64.0               91          2021.0                3      Pune
21  Female          69           71.0           63.0               77          2019.0                2    Mulshi
23  Female          74           62.0           69.0              100          2021.0                3      Pune
24    Male          77           61.0           74.0               84          2021.0                2    Mulshi
25  Female          75           73.0           74.0               86          2021.0                3      Pune
26    Male          65           64.0           64.0               96          2021.0                3    Mulshi
27    Male          64           68.0           65.0               95          2020.0                3    Mulshi
28    Male          69           68.0           80.0               75          2019.0                2      Pune


In [9]:

# how='all' drops a row only when every value in it is missing,
# so no rows are removed from this dataset
sp.dropna(how='all')

Out[9]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
0     Male          66           65.0           76.0               97          2020.0                3      Pune
1   Female          91           70.0           76.0               76          2019.0                2  Baramati
2   Female          72           66.0           75.0               79          2019.0                2    Satara
3     Male          99           75.0           67.0               84          2018.0                2     Karad
4   Female          62           65.0           68.0               87          2018.0                3    Mulshi
5      NaN          69            NaN           75.0               50             NaN                1  Baramati
6   Female          77          100.0           72.0               93          2020.0                3     Karad
7     Male          62           80.0           74.0               96          2021.0                3      Pune
8   Female          75           69.0            NaN               77          2019.0                2     Karad
9   Female          79           64.0          101.0               85          2021.0                3  Baramati
10  Female          61           69.0           64.0               80          2019.0                2       NaN
11     NaN          61           61.0           69.0               80          2020.0                2      Pune
12  Female          78           73.0           73.0               90             NaN                3     Karad
13  Female          66           68.0           75.0               98          2020.0                3  Baramati
14    Male          80           78.0           75.0               95          2018.0                3      Pune
15  Female          72           68.0           72.0               84          2021.0                2    Mulshi
16    Male          80           66.0           73.0               79          2019.0                2  Baramati
17    Male          60            NaN           62.0               85          2020.0                3      Pune
18     NaN          80           77.0           68.0               89          2018.0                3    Mulshi
19  Female          72           72.0           78.0               76          2018.0                2       NaN
20    Male          67           60.0           64.0               91          2021.0                3      Pune
21  Female          69           71.0           63.0               77          2019.0                2    Mulshi
22    Male          64           69.0           63.0               75             NaN                2  Baramati
23  Female          74           62.0           69.0              100          2021.0                3      Pune
24    Male          77           61.0           74.0               84          2021.0                2    Mulshi
25  Female          75           73.0           74.0               86          2021.0                3      Pune
26    Male          65           64.0           64.0               96          2021.0                3    Mulshi
27    Male          64           68.0           65.0               95          2020.0                3    Mulshi
28    Male          69           68.0           80.0               75          2019.0                2      Pune

In [10]:

# axis=1 drops every column that contains at least one missing value
sp.dropna(axis=1)

Out[10]:
    math score  placement score  placement count
0           66               97                3
1           91               76                2
2           72               79                2
3           99               84                2
4           62               87                3
5           69               50                1
6           77               93                3
7           62               96                3
8           75               77                2
9           79               85                3
10          61               80                2
11          61               80                2
12          78               90                3
13          66               98                3
14          80               95                3
15          72               84                2
16          80               79                2
17          60               85                3
18          80               89                3
19          72               76                2
20          67               91                3
21          69               77                2
22          64               75                2
23          74              100                3
24          77               84                2
25          75               86                3
26          65               96                3
27          64               95                3
28          69               75                2
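The three `dropna` calls above differ only in their `how` and `axis` arguments. The sketch below contrasts them on a toy frame whose values are invented for illustration and are not taken from the dataset. The names `toy`, `any_rows`, `all_rows` and `complete_cols` are likewise illustrative.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only. Row 3 is entirely missing.
toy = pd.DataFrame(
    {
        "a": [1.0, np.nan, np.nan, np.nan],
        "b": [2.0, 5.0, np.nan, np.nan],
        "c": [3.0, 6.0, 9.0, np.nan],
    }
)

any_rows = toy.dropna()           # default: axis=0, how='any'
all_rows = toy.dropna(how="all")  # drop a row only if every value is missing

# After removing the empty row, keep only the columns without any gap.
complete_cols = all_rows.dropna(axis=1)

print(any_rows.index.tolist())
print(all_rows.index.tolist())
print(complete_cols.columns.tolist())
```

In the notebook's dataset no row is entirely empty, which is why `how='all'` returns all 29 rows unchanged while the default call keeps only the 20 complete ones.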

In [11]:

new_data = sp.dropna(axis=0, how='any')
new_data

Out[11]:
    gender  math score  reading score  writing score  placement score  club join year  placement count    region
0     Male          66           65.0           76.0               97          2020.0                3      Pune
1   Female          91           70.0           76.0               76          2019.0                2  Baramati
2   Female          72           66.0           75.0               79          2019.0                2    Satara
3     Male          99           75.0           67.0               84          2018.0                2     Karad
4   Female          62           65.0           68.0               87          2018.0                3    Mulshi
6   Female          77          100.0           72.0               93          2020.0                3     Karad
7     Male          62           80.0           74.0               96          2021.0                3      Pune
9   Female          79           64.0          101.0               85          2021.0                3  Baramati
13  Female          66           68.0           75.0               98          2020.0                3  Baramati
14    Male          80           78.0           75.0               95          2018.0                3      Pune
15  Female          72           68.0           72.0               84          2021.0                2    Mulshi
16    Male          80           66.0           73.0               79          2019.0                2  Baramati
20    Male          67           60.0           64.0               91          2021.0                3      Pune
21  Female          69           71.0           63.0               77          2019.0                2    Mulshi
23  Female          74           62.0           69.0              100          2021.0                3      Pune
24    Male          77           61.0           74.0               84          2021.0                2    Mulshi
25  Female          75           73.0           74.0               86          2021.0                3      Pune
26    Male          65           64.0           64.0               96          2021.0                3    Mulshi
27    Male          64           68.0           65.0               95          2020.0                3    Mulshi
28    Male          69           68.0           80.0               75          2019.0                2      Pune
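`dropna` returns a new frame and leaves `sp` unchanged, and the result keeps the original row labels, so the index has gaps where rows were removed. The sketch below shows one way to renumber the cleaned rows, assuming a four-row excerpt of the dataset (rows 4 to 7 from Out[9], three columns). The names `excerpt`, `cleaned` and `renumbered` are illustrative.

```python
import numpy as np
import pandas as pd

# Rows 4-7 from the Out[9] listing, three columns only; `excerpt` is illustrative.
excerpt = pd.DataFrame(
    {
        "gender": ["Female", np.nan, "Female", "Male"],
        "math score": [62, 69, 77, 62],
        "reading score": [65.0, np.nan, 100.0, 80.0],
    },
    index=[4, 5, 6, 7],
)

cleaned = excerpt.dropna(axis=0, how="any")
print(cleaned.index.tolist())  # labels of the surviving rows
print(len(excerpt))            # the source frame still has all of its rows

# drop=True discards the old labels instead of keeping them as a column.
renumbered = cleaned.reset_index(drop=True)
print(renumbered.index.tolist())
```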
In [12]:

# Positions of the rows whose math score is above 90
print(np.where(sp['math score'] > 90))

(array([1, 3]),)
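`np.where` applied to a Boolean Series returns integer positions, not index labels. The two coincide for `sp` because its index is the default 0 to 28, but they diverge once rows have been dropped. The sketch below shows this, assuming the `math score` values of rows 0 to 4 as printed in Out[9]. The names `scores`, `trimmed` and `mask` are illustrative.

```python
import numpy as np
import pandas as pd

# math score of rows 0-4, copied from the Out[9] listing.
scores = pd.Series([66, 91, 72, 99, 62], name="math score")

# With a default index, positions and labels are the same numbers.
positions = np.where(scores > 90)[0]
print(positions)

# After dropping the first row, positions and labels no longer match.
trimmed = scores.drop(index=0)
mask = trimmed > 90
trimmed_positions = np.where(mask)[0]    # positions within `trimmed`
trimmed_labels = trimmed.index[mask]     # original row labels

print(trimmed_positions)
print(trimmed_labels.tolist())
```

When the labels are what matters, for example to look rows up again with `.loc`, indexing the frame's `index` with the mask is the safer of the two.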
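In [13]:

# Positions of the rows whose reading score is below 25 (none in this dataset)
print(np.where(sp['reading score'] < 25))

(array([], dtype=int64),)

In [14]:

# Box plots of three numeric columns to inspect their spread and outliers
col = ['math score', 'reading score', 'placement score']
sp.boxplot(col)

Out[14]: <Axes: >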
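A box plot draws as separate points any values that lie more than 1.5 times the interquartile range beyond the quartiles (matplotlib's default whisker setting). The sketch below computes that rule directly, assuming the 29 `placement score` values as printed in Out[10]. The names `placement`, `lower`, `upper` and `outliers` are illustrative, and the notebook itself does not perform this step.

```python
import pandas as pd

# placement score for rows 0-28, copied from the Out[10] listing.
placement = pd.Series(
    [97, 76, 79, 84, 87, 50, 93, 96, 77, 85,
     80, 80, 90, 98, 95, 84, 79, 85, 89, 76,
     91, 77, 75, 100, 84, 86, 96, 95, 75],
    name="placement score",
)

q1 = placement.quantile(0.25)
q3 = placement.quantile(0.75)
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Values outside the fences are the candidates a box plot shows as fliers.
outliers = placement[(placement < lower) | (placement > upper)]
print(lower, upper)
print(outliers)
```

Unlike the fixed thresholds used with `np.where` in the earlier cells, these fences adapt to the spread of each column.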

In [18]:

# Scatter plot of placement score against placement count
fig, axes = plt.subplots(figsize=(18, 10))
axes.scatter(sp['placement score'], sp['placement count'])
plt.show()

In [19]:

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
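The notebook stops after creating the encoder. The sketch below shows how it might be applied, assuming the intent was to convert the `gender` strings into integer codes. The source does not show this step. The values are the first four `gender` entries from Out[9], and the names `gender` and `codes` are illustrative.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# First four gender values from the Out[9] listing.
gender = pd.Series(["Male", "Female", "Female", "Male"], name="gender")

le = LabelEncoder()

# fit_transform learns the sorted class labels and maps each value to its position.
codes = le.fit_transform(gender)

print(le.classes_)  # class labels in sorted order
print(codes)        # integer code for each row

# inverse_transform maps the codes back to the original strings.
print(le.inverse_transform(codes))
```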

In [ ]:
