Assignment 5
MIS: 642310018
BRANCH: MECHANICAL
BATCH: H
ASSIGNMENT NO. 5
Use scikit-learn to print the keys, the number of rows and columns, the
feature names, and the description of the Iris dataset.
INPUT:
from sklearn.datasets import load_iris
iris = load_iris()
print("Keys of the dataset:\n", iris.keys())
print("\nNumber of rows and columns:\n", iris.data.shape)
print("\nFeature names:", iris.feature_names)
print("\nDataset description:\n", iris.DESCR)
OUTPUT:
Keys of the dataset:
dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])
Number of rows and columns:
(150, 4)
Feature names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Dataset description:
.. _iris_dataset:
:Summary Statistics:
The famous Iris database, first used by Sir R.A. Fisher. The dataset is taken
from Fisher's paper. Note that it's the same as in R, but not as in the UCI
Machine Learning Repository, which has two wrong data points.
|details-start|
**References**
|details-split|
OUTPUT:
       sepal length (cm)  sepal width (cm)  petal length (cm)  \
count         150.000000        150.000000         150.000000
mean            5.843333          3.057333           3.758000
std             0.828066          0.435866           1.765298
min             4.300000          2.000000           1.000000
25%             5.100000          2.800000           1.600000
50%             5.800000          3.000000           4.350000
75%             6.400000          3.300000           5.100000
max             7.900000          4.400000           6.900000
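The INPUT cell for the summary statistics above is not shown. A minimal sketch that would produce a table of this kind is given here. It assumes the features are loaded as a pandas DataFrame with `load_iris(as_frame=True)`; the original cell may have taken a different route.

```python
from sklearn.datasets import load_iris

# Assumption: load the four feature columns as a pandas DataFrame.
features = load_iris(as_frame=True).data

# describe() reports count, mean, std, min, quartiles and max per column.
summary = features.describe()
print(summary)
```

With all four feature columns present, pandas may wrap the table across lines, which would explain the trailing backslash in the header row above.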
INPUT:
import numpy as np
from scipy.sparse import csr_matrix
dense_matrix = np.eye(5)  # 5x5 identity matrix as a dense NumPy array
sparse_matrix = csr_matrix(dense_matrix)
print(sparse_matrix)
OUTPUT:
(0, 0) 1.0
(1, 1) 1.0
(2, 2) 1.0
(3, 3) 1.0
(4, 4) 1.0
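Each printed pair is the (row, column) position of a stored non-zero value. As a rough sketch of what the CSR format keeps internally for the same identity matrix, the three underlying arrays can be inspected directly. The variable name `restored` is only illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix

sparse_matrix = csr_matrix(np.eye(5))

# CSR stores only the non-zero values plus two index arrays.
print(sparse_matrix.data)     # the stored non-zero values
print(sparse_matrix.indices)  # column index of each stored value
print(sparse_matrix.indptr)   # offsets marking where each row starts
print(sparse_matrix.nnz)      # number of stored entries

# Converting back to a dense array recovers the original matrix.
restored = sparse_matrix.toarray()
print(np.array_equal(restored, np.eye(5)))
```

Only 5 of the 25 entries are stored, which is the point of using a sparse representation for matrices that are mostly zero.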
INPUT:
import pandas as pd
df = pd.read_csv('iris.csv')
print(df.head())
# Drop the 'sepal.length' column and the row with index label 2
df_modified = df.drop(columns=['sepal.length'], index=2)
print("\nModified DataFrame (without 'sepal.length' column and row 2):")
print(df_modified.head())
OUTPUT:
   sepal.length  sepal.width  petal.length  petal.width variety
0           5.1          3.5           1.4          0.2  Setosa
1           4.9          3.0           1.4          0.2  Setosa
2           4.7          3.2           1.3          0.2  Setosa
3           4.6          3.1           1.5          0.2  Setosa
4           5.0          3.6           1.4          0.2  Setosa
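The output of the second print is not shown. The following sketch checks what `drop` does on a small in-memory DataFrame built from the five rows listed above, so no `iris.csv` file is assumed. It illustrates that `drop` returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

# First five rows copied from the output above, so no CSV file is needed.
df = pd.DataFrame({
    "sepal.length": [5.1, 4.9, 4.7, 4.6, 5.0],
    "sepal.width": [3.5, 3.0, 3.2, 3.1, 3.6],
    "petal.length": [1.4, 1.4, 1.3, 1.5, 1.4],
    "petal.width": [0.2, 0.2, 0.2, 0.2, 0.2],
    "variety": ["Setosa"] * 5,
})

# Remove one column and one row (by index label) in a single call.
df_modified = df.drop(columns=["sepal.length"], index=2)

print(df_modified.columns.tolist())  # remaining column names
print(df_modified.index.tolist())    # remaining row labels
print(df.shape, df_modified.shape)   # original is left untouched
```

The remaining row labels skip 2 because `drop` does not renumber the index; calling `reset_index(drop=True)` afterwards would renumber it if consecutive labels are wanted.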