
Experiment 1

The document outlines a Python script that loads the Iris dataset for supervised learning, splits it into training and testing sets, and creates a summary DataFrame for exploration. It prints basic statistics of the dataset and visualizes the data using a pairplot. The summary includes the count, mean, standard deviation, minimum, maximum, and quartiles for each feature in the dataset.

Uploaded by

Rishab

# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Load a dataset suitable for supervised learning (Iris dataset)
iris = datasets.load_iris()
X = iris.data    # feature matrix: 150 samples x 4 measurements (cm)
y = iris.target  # integer class labels: 0, 1, 2

# Split the dataset into training and testing sets (70% train, 30% test);
# random_state fixes the shuffle so the split is reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Create a summary DataFrame for exploration (built from the full dataset,
# not only the training split)
summary = pd.DataFrame(X, columns=iris.feature_names)
summary['target'] = y

# Display basic statistics
print("Dataset Summary:\n", summary.describe(include='all'))

# Plot a pairplot using seaborn for visualization (optional)
import seaborn as sns  # imported here because this step is optional
sns.pairplot(summary, hue='target')
plt.suptitle("Iris Dataset Pairplot", y=1.02)
plt.show()
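
The script creates the train/test split but never inspects it. The following is a minimal sketch, not part of the original experiment, that checks the sizes of the two sets and counts how many samples of each class landed in the test set. The `stratify=y` variant and the names `train_counts`, `test_counts`, and `strat_test_counts` are assumptions added for illustration; stratification is one possible way to keep the class proportions equal in both sets.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Same split as in the script above
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

print("Train shape:", X_train.shape)
print("Test shape:", X_test.shape)

# Count samples per class (0, 1, 2) in each subset
train_counts = np.bincount(y_train, minlength=3)
test_counts = np.bincount(y_test, minlength=3)
print("Train class counts:", train_counts)
print("Test class counts:", test_counts)

# Illustrative variant (an assumption, not in the original script):
# stratify on y so every class keeps the same share in both subsets
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)
strat_test_counts = np.bincount(y_test_strat, minlength=3)
print("Stratified test class counts:", strat_test_counts)
```

With 150 samples and `test_size=0.3`, the split holds 105 training rows and 45 test rows. Without stratification the per-class counts in the test set depend on the shuffle, so they are printed rather than assumed.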

Dataset Summary:
        sepal length (cm)  sepal width (cm)  petal length (cm)  \
count         150.000000        150.000000         150.000000
mean            5.843333          3.057333           3.758000
std             0.828066          0.435866           1.765298
min             4.300000          2.000000           1.000000
25%             5.100000          2.800000           1.600000
50%             5.800000          3.000000           4.350000
75%             6.400000          3.300000           5.100000
max             7.900000          4.400000           6.900000

       petal width (cm)      target
count        150.000000  150.000000
mean           1.199333    1.000000
std            0.762238    0.819232
min            0.100000    0.000000
25%            0.300000    0.000000
50%            1.300000    1.000000
75%            1.800000    2.000000
max            2.500000    2.000000
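
The output above lists `min` and `max` but not the range itself, and it summarizes `target` as if it were a numeric measurement, although its values are class codes. The sketch below is an illustrative addition, not part of the original script. It derives each feature's range as `max - min` from the `describe()` result and attaches the species names from `iris.target_names`. The names `stats`, `feature_range`, and the `species` column are assumptions chosen for this example.

```python
import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()
summary = pd.DataFrame(iris.data, columns=iris.feature_names)

# describe() returns a DataFrame indexed by statistic name,
# so the range of each feature is the "max" row minus the "min" row
stats = summary.describe()
feature_range = stats.loc["max"] - stats.loc["min"]
print("Range per feature:\n", feature_range)

# Map the integer class codes to species names so that counts are
# reported per class instead of as a numeric mean and std
summary["species"] = pd.Categorical.from_codes(
    iris.target, categories=iris.target_names)
print(summary["species"].value_counts())
```

Reading the values off the table, the range of sepal length, for example, is 7.9 - 4.3 = 3.6 cm. The species counts should show 50 samples per class, which is why the numeric summary of `target` reports a mean of exactly 1.0.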
