PR 6
Roll No : COT53
DSBDA Practical 6
In [2]:
data = pd.read_csv("https://raw.githubusercontent.com/venky14/Machine-Learning-
In [3]: data
Out[3]:
      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species
0      1            5.1           3.5            1.4           0.2     Iris-setosa
1      2            4.9           3.0            1.4           0.2     Iris-setosa
2      3            4.7           3.2            1.3           0.2     Iris-setosa
3      4            4.6           3.1            1.5           0.2     Iris-setosa
4      5            5.0           3.6            1.4           0.2     Iris-setosa
..   ...            ...           ...            ...           ...             ...
145  146            6.7           3.0            5.2           2.3  Iris-virginica
147  148            6.5           3.0            5.2           2.0  Iris-virginica
148  149            6.2           3.4            5.4           2.3  Iris-virginica
In [4]: data.head(5)
In [5]: data.tail()
      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species
145  146            6.7           3.0            5.2           2.3  Iris-virginica
149  150            5.9           3.0            5.1           1.8  Iris-virginica
In [6]:
              Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
unique       NaN            NaN           NaN            NaN           NaN
top          NaN            NaN           NaN            NaN           NaN
freq         NaN            NaN           NaN            NaN           NaN
min     1.000000       4.300000      2.000000       1.000000      0.100000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Id             150 non-null    int64
 1   SepalLengthCm  150 non-null    float64
 2   SepalWidthCm   150 non-null    float64
 3   PetalLengthCm  150 non-null    float64
 4   PetalWidthCm   150 non-null    float64
 5   Species        150 non-null    object
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB
(150, 6)
Out[8]: array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], dtype=object)
In [9]: data.isnull().sum()
Out[9]:
Id               0
SepalLengthCm    0
SepalWidthCm     0
PetalLengthCm    0
PetalWidthCm     0
Species          0
dtype: int64
In [10]:
x = data.iloc[:,1:5]
y = data.iloc[:,5:]
In [11]:
encode = LabelEncoder()
y = encode.fit_transform(y)
C:\Users\coeco\anaconda3\lib\site-packages\sklearn\preprocessing\_label.py:115: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
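The warning above appears because `data.iloc[:,5:]` returns a one-column DataFrame (shape `(150, 1)`) rather than a 1-D array. A minimal sketch of one way to avoid it follows. It selects the column by name so that `y` is a 1-D Series. The three-row frame is a hypothetical stand-in for the notebook's `data`, not the real dataset.

```python
import warnings

import pandas as pd
from sklearn.exceptions import DataConversionWarning
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the notebook's `data` frame (three rows only).
data = pd.DataFrame({
    "Id": [1, 2, 3],
    "PetalWidthCm": [0.2, 1.3, 2.3],
    "Species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
})

# Selecting by name yields a 1-D Series, not a one-column DataFrame.
y = data["Species"]

encode = LabelEncoder()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning raised in this block
    y_encoded = encode.fit_transform(y)

# Keep only the shape-related warning that the notebook triggered.
shape_warnings = [w for w in caught
                  if issubclass(w.category, DataConversionWarning)]

print(list(y_encoded))        # integer codes, assigned in sorted label order
print(list(encode.classes_))  # the original string labels
print(len(shape_warnings))    # number of DataConversionWarning instances
```

An equivalent option that keeps the positional slicing is `data.iloc[:, 5:].values.ravel()`, which is what the warning message itself suggests.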
In [14]: pred
Out[14]: array([2, 1, 0, 2, 0, 2, 0, 1, 1, 1, 2, 1, 1, 1, 1, 0, 1, 1, 0, 0, 2, 1,
0, 0, 2, 0, 0, 1, 1, 0, 2, 1, 0, 2, 2, 1, 0, 1, 1, 1, 2, 0, 2, 0,
0])
In [15]: y_test
0])
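The cells that produce `pred` and `y_test` are not shown in this export. The sketch below is one plausible way to obtain arrays of the same shape, and every choice in it is an assumption rather than the notebook's confirmed method. It uses scikit-learn's bundled Iris data in place of the CSV, a 70/30 split with `random_state=0`, and a Gaussian Naive Bayes classifier.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumption: the bundled Iris data stands in for the CSV loaded in In [2].
x, y = load_iris(return_X_y=True)

# Assumption: a 70/30 split; 30% of 150 rows gives 45 test samples,
# which matches the length of the arrays printed above.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)

# Assumption: Gaussian Naive Bayes; the export does not show the model used.
model = GaussianNB()
model.fit(x_train, y_train)
pred = model.predict(x_test)

print(len(pred), len(y_test))
```

If the notebook used a different classifier or seed, only the `model = ...` and `random_state` lines would change; the rest of the flow stays the same.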
In [16]:
[[16  0  0]
 [ 0 18  0]
 [ 0  0 11]]
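To make the matrix above concrete, the sketch below rebuilds it by counting (true, predicted) label pairs in plain Python. The `pred` values are copied from Out[14]. Because the printed `y_test` is truncated, it is assumed here to equal `pred`, which is what a purely diagonal confusion matrix implies. The helper name `confusion` is hypothetical.

```python
# Predicted labels copied from Out[14].
pred = [2, 1, 0, 2, 0, 2, 0, 1, 1, 1, 2, 1, 1, 1, 1, 0, 1, 1, 0, 0, 2, 1,
        0, 0, 2, 0, 0, 1, 1, 0, 2, 1, 0, 2, 2, 1, 0, 1, 1, 1, 2, 0, 2, 0,
        0]

# Assumption: y_test is identical to pred (implied by the diagonal matrix).
y_test = list(pred)


def confusion(y_true, y_pred, n_classes):
    """Count label pairs; rows are true classes, columns are predictions."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


cm = confusion(y_test, pred, 3)
for row in cm:
    print(row)
# [16, 0, 0]
# [0, 18, 0]
# [0, 0, 11]
```

The diagonal entries are the per-class counts of correct predictions; any off-diagonal entry would be a misclassification.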
In [18]: print(classification_report(y_test,pred))
accuracy 1.00 45
Accuracy: 1.00
Error Rate: 0.0
Sensitivity (Recall or True positive rate) : 1.0
Specificity (True negative rate) : 1.0
Precision (Positive predictive value) : 1.0
False Positive Rate : 0.0
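The figures above can be derived directly from the confusion matrix in In [16]. The sketch below treats each class one-vs-rest and then macro-averages the per-class rates. The export does not show how the notebook computed these values, so the averaging scheme and the helper name `one_vs_rest` are assumptions.

```python
# Confusion matrix from In [16] (rows: true class, columns: predicted class).
cm = [[16, 0, 0],
      [0, 18, 0],
      [0, 0, 11]]

n = len(cm)
total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(n)) / total
error_rate = 1 - accuracy


def one_vs_rest(cm, k, total):
    """Return (tp, fn, fp, tn) with class k as positive and the rest negative."""
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                 # true k, predicted as another class
    fp = sum(row[k] for row in cm) - tp  # another class, predicted as k
    tn = total - tp - fn - fp
    return tp, fn, fp, tn


recall, specificity, precision, fpr = [], [], [], []
for k in range(n):
    tp, fn, fp, tn = one_vs_rest(cm, k, total)
    recall.append(tp / (tp + fn))
    specificity.append(tn / (tn + fp))
    precision.append(tp / (tp + fp))
    fpr.append(fp / (fp + tn))


def macro(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


print("Accuracy:", accuracy)                 # 1.0
print("Error Rate:", error_rate)             # 0.0
print("Sensitivity:", macro(recall))         # 1.0
print("Specificity:", macro(specificity))    # 1.0
print("Precision:", macro(precision))        # 1.0
print("False Positive Rate:", macro(fpr))    # 0.0
```

Each rate divides by a class total, so this sketch assumes every class appears at least once among both the true and the predicted labels, as it does here.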