Python (Visualization)

This document discusses importing and visualizing the Iris dataset using Python. It first imports the Iris dataset from scikit-learn and displays information about the feature names, target classes, and attribute values. It then imports matplotlib for visualization. Scatter plots are generated to visualize relationships between attribute pairs like petal length and width, with data points colored by target class. Other attribute pairs like sepal length and width can also be visualized in this way.

Uploaded by

Oscar Wong
Copyright
© All Rights Reserved

Python (Data Visualization)

1. Data Import
We can import the Iris dataset from the Python package scikit-learn.
Detailed information about scikit-learn can be found at scikit-learn.org.
from sklearn import datasets
iris = datasets.load_iris()
What does the Iris dataset look like? To display the names of the attributes (features):
iris.feature_names
Result: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
To display the names of the target classes:
iris.target_names
Result: array(['setosa', 'versicolor', 'virginica'], dtype='<U10')
To display the attribute values of the records:
iris.data
Result: array([[5.1, 3.5, 1.4, 0.2], [4.9, 3. , 1.4, 0.2], [4.7, 3.2, 1.3, 0.2], [4.6, 3.1, 1.5, 0.2],
[5. , 3.6, 1.4, 0.2], …])
To display the target outputs of the records:
iris.target
Result: array([0, 0, 0, …, 1, 1, 1, …, 2, 2, 2, …])
The classes ‘setosa’, ‘versicolor’ and ‘virginica’ are denoted by 0, 1, and 2, respectively.
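The correspondence between codes and class names can be made explicit with a small helper. This is a minimal sketch: the function name decode_targets and the short list of sample codes are hypothetical and used only for illustration; the class names are the ones reported by iris.target_names above.

```python
def decode_targets(target, target_names):
    """Map integer class codes (0, 1, 2, ...) to their class names."""
    return [str(target_names[int(code)]) for code in target]


# Class names in the order reported by iris.target_names.
names = ["setosa", "versicolor", "virginica"]

# A short, hand-written list of codes used only for illustration
# (not taken from the dataset).
sample_codes = [0, 0, 1, 2]

decoded = decode_targets(sample_codes, names)
print(decoded)
```

With the dataset loaded as above, the same helper should also accept the real arrays, for example decode_targets(iris.target, iris.target_names).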

2. Data Visualization
In this section, we use the package matplotlib to visualize data.
Detailed information about matplotlib can be found at matplotlib.org.
Package setup for visualization:
import matplotlib.pyplot as plt
We use a subset of attributes in the Iris dataset for visualization. First, we select the attributes
“Petal length” and “Petal width”, which are columns 2 and 3 of iris.data, as follows.
X = iris.data[:, 2:4]
t = iris.target
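The slicing notation used here can be checked on a small array. The sketch below assumes NumPy is installed (scikit-learn depends on it) and uses only the first five records shown in the iris.data output above; the variable names rows, petal, and sepal are illustrative.

```python
import numpy as np

# First five records, copied from the iris.data output shown earlier.
rows = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [4.6, 3.1, 1.5, 0.2],
    [5.0, 3.6, 1.4, 0.2],
])

# ':' keeps every row; '2:4' keeps columns 2 and 3 (the stop index 4 is excluded).
petal = rows[:, 2:4]

# ':2' is shorthand for '0:2', i.e. columns 0 and 1 (the sepal measurements).
sepal = rows[:, :2]

print(petal.shape)   # number of rows, number of selected columns
print(petal[0], sepal[0])
```

The result of each slice is a two-column array, which is why the plotting code can refer to its columns as X[:, 0] and X[:, 1] regardless of which attribute pair was selected.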
We can now generate a scatter plot using the attribute values in X. Passing the target outputs as the
argument c colors each point by its class, so the three classes can be told apart.
plt.scatter(X[:, 0], X[:, 1], c=t)
plt.xlabel('Petal length')
plt.ylabel('Petal width')
plt.show()
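With c=t the plot shows three colors but does not say which color belongs to which class. One possible variation, sketched here and not part of the original handout, draws each class with its own scatter call so that a legend can be attached. The axis labels are read from iris.feature_names rather than typed by hand.

```python
import matplotlib.pyplot as plt
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data[:, 2:4]   # petal length, petal width
t = iris.target

fig, ax = plt.subplots()

# One scatter call per class so that each class gets its own legend entry.
for code, name in enumerate(iris.target_names):
    mask = t == code                      # boolean mask selecting one class
    ax.scatter(X[mask, 0], X[mask, 1], label=str(name))

ax.set_xlabel(iris.feature_names[2])
ax.set_ylabel(iris.feature_names[3])
ax.legend(title="Class")
```

Calling plt.show() afterwards displays the figure in the same way as before.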

You can generate the scatter plot for other pairs of attributes. For example, the attribute pair
(Sepal length, Sepal width) can be specified as follows:
X = iris.data[:, :2]
Accordingly, labels for the two axes should also be changed:
plt.scatter(X[:, 0], X[:, 1], c=t)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.show()
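Since there are four attributes, there are six possible attribute pairs. Rather than editing the slice and the labels by hand for each one, the pairs can be generated in a loop. The following is a sketch under the assumption that a 2-by-3 grid of subplots is an acceptable layout; the figure size is an arbitrary choice.

```python
from itertools import combinations

import matplotlib.pyplot as plt
from sklearn import datasets

iris = datasets.load_iris()

# All unordered pairs of column indices: (0, 1), (0, 2), ..., (2, 3).
pairs = list(combinations(range(4), 2))

fig, axes = plt.subplots(2, 3, figsize=(12, 7))
for ax, (i, j) in zip(axes.flat, pairs):
    # Same call as in the single-pair example, applied to columns i and j.
    ax.scatter(iris.data[:, i], iris.data[:, j], c=iris.target)
    ax.set_xlabel(iris.feature_names[i])
    ax.set_ylabel(iris.feature_names[j])

fig.tight_layout()
```

As with the single plots, plt.show() displays the result.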