Notebook Visualizing Data Book
1 Visualizing Data
A fundamental part of the data scientist’s toolkit is data visualization. Although it is very easy to
create visualizations, it’s much harder to produce good ones.
1.1 matplotlib
Many tools exist for visualizing data. We will use the widely used matplotlib library. Matplotlib is
a Python 2D plotting library that produces publication-quality figures in a variety of hardcopy
formats and interactive environments across platforms.
Checking Matplotlib Version
[ ]: import matplotlib
print(matplotlib.__version__)
3.4.2
Changing the Label Type and Graph Thickness
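[ ]: # Changing the Label Type and Graph Thickness
import matplotlib.pyplot as plt
squares = [1, 4, 9, 16, 25]
# linewidth sets the thickness of the plotted line
plt.plot(squares, linewidth=3)
# fontsize sets the size of the axis label text
plt.xlabel("Value", fontsize=14)
plt.ylabel("Square of Value", fontsize=14)
plt.show()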
Correcting the Plot
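[ ]: # Correcting the Plot
import matplotlib.pyplot as plt
input_values = [1, 2, 3, 4, 5]
squares = [1, 4, 9, 16, 25]
# Pass the x values explicitly so the first point is plotted at x=1, not x=0
plt.plot(input_values, squares, linewidth=3)
plt.show()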
Markers
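[ ]: # Markers
import matplotlib.pyplot as plt
input_values = [1, 2, 3, 4, 5]
squares = [1, 4, 9, 16, 25]
# marker='o' draws a circle at each data point
plt.plot(input_values, squares, marker='o')
plt.show()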
Linestyle
You can use the keyword argument linestyle, or the shorter ls, to change the style of the plotted line:
[ ]: # Linestyle
import matplotlib.pyplot as plt
plt.plot(squares, linestyle='dotted')  # squares is defined in the cell above
plt.show()
• linestyle can be shortened to ls
• 'dotted' can be shortened to ':'
• 'dashed' can be shortened to '--'
Style                Short form
'solid' (default)    '-'
'dotted'             ':'
'dashed'             '--'
'dashdot'            '-.'
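The four styles in the table above can be compared in a single figure. The following is a minimal sketch: the sample data and the vertical offsets are chosen only for illustration, and the short forms are passed through the ls keyword.

```python
import matplotlib.pyplot as plt

# Illustrative data (same squares used elsewhere in this notebook)
squares = [1, 4, 9, 16, 25]

# Map each style name from the table to its short form
styles = {"solid": "-", "dotted": ":", "dashed": "--", "dashdot": "-."}

# Draw one line per style, shifted upward so the lines do not overlap
for offset, (name, short) in enumerate(styles.items()):
    shifted = [value + 5 * offset for value in squares]
    plt.plot(shifted, ls=short, label=f"{name} ({short!r})")

plt.legend()
plt.show()
```

Passing the full name (for example linestyle='dashed') instead of the short form should give the same result.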
Line Color
You can use the keyword argument color or the shorter c to set the color of the line:
[ ]: # squares is defined in an earlier cell
plt.plot(squares, c='green')  # c is the short form of color
plt.show()
Multiple Lines
You can plot as many lines as you like by adding more plt.plot() calls before plt.show():
y1 = [3, 8, 1, 10]
y2 = [6, 2, 7, 11]
y3 = [5, 1, 12, 13]
plt.plot(y1, c = 'r')
plt.plot(y2, c = 'b')
plt.plot(y3)
plt.show()
Grid Lines
With Pyplot, you can use the grid() function to add grid lines to the plot.
plt.show()
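A minimal, self-contained sketch of grid() in use, reusing the squares data from the earlier cells. Calling grid() with no arguments turns on the default grid lines for both axes; the marker and axis labels are illustrative choices.

```python
import matplotlib.pyplot as plt

input_values = [1, 2, 3, 4, 5]
squares = [1, 4, 9, 16, 25]

plt.plot(input_values, squares, marker="o")
plt.xlabel("Value")
plt.ylabel("Square of Value")

# Add grid lines on both axes with the default appearance
plt.grid()
plt.show()
```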
TRY IT YOURSELF
Change the attributes of the grid, such as its color, line style, and line width.
Matplotlib Subplots
Display Multiple Plots
With the subplot() function you can draw multiple plots in one figure:
#plot 1:
x = [0, 1, 2, 3]
y = [3, 8, 1, 10]
plt.subplot(1, 3, 2)
plt.title("First")
plt.plot(x,y)
#plot 2:
x = [0, 1, 2, 3]
y = [10, 20, 30, 40]
plt.subplot(1, 3, 3)
plt.title("Second")
plt.plot(x,y)
#plot 3:
x = [4, 3, 2, 1]
y = [16, 9, 4, 1]
plt.subplot(1, 3, 1)
plt.title("THIRD")
plt.plot(x,y)
plt.suptitle("Three plots")
plt.show()
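Matplotlib also provides plt.subplots() (with a trailing s), which creates the figure and all of its axes in one call. The sketch below is an alternative to the plt.subplot() calls above, not the approach used in this notebook; it stacks two of the same data sets vertically, and the layout and titles are chosen only for illustration.

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y_first = [3, 8, 1, 10]
y_second = [10, 20, 30, 40]

# Create one figure with 2 rows and 1 column of axes
fig, axes = plt.subplots(2, 1)

axes[0].plot(x, y_first)
axes[0].set_title("First")

axes[1].plot(x, y_second)
axes[1].set_title("Second")

fig.suptitle("Two plots")
fig.tight_layout()  # keep titles and tick labels from overlapping
plt.show()
```

Each entry of axes is addressed by index, so the position of a plot is set by which axes object is used, not by the order of the calls.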
TRY IT YOURSELF
Plot two or more graphs as subplots arranged horizontally.
[ ]: import matplotlib.pyplot as plt
plt.show()
[ ]: import matplotlib.pyplot as plt
x_values = [1, 2, 3, 4, 5]
y_values = [1, 4, 9, 16, 25]
plt.scatter(x_values, y_values)
plt.show()
Calculating Data Automatically
[ ]: import matplotlib.pyplot as plt
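As a minimal sketch of the idea in the heading, the values can be generated with range() and a list comprehension instead of being typed by hand. The range of 1 to 1000, the marker size s=10, and the axis labels are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt

# Generate the x values, then compute each square instead of typing it
x_values = range(1, 1001)
y_values = [x ** 2 for x in x_values]

# s sets the marker size; small markers keep 1000 points readable
plt.scatter(x_values, y_values, s=10)
plt.xlabel("Value")
plt.ylabel("Square of Value")
plt.show()
```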
[ ]: # Compare Plots
import matplotlib.pyplot as plt
plt.show()
[ ]: import matplotlib.pyplot as plt
plt.show()
Using a Colormap
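[ ]: import matplotlib.pyplot as plt
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]
colors = [0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100]
# Each point is colored by its value in colors; 'viridis' is one example colormap
plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar()  # show how the values map to colors
plt.show()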
TRY IT YOURSELF
Cubes: A number raised to the third power is a cube. Plot the first five cubic numbers, and then
plot the first 5000 cubic numbers. Use a scatter plot with a colormap.
1.1.3 Bars
With Pyplot, you can use the bar() function to draw bar graphs:
plt.bar(x, y, color='g')
# You can use plt.yticks
#plt.yticks(range(0, 101, 10))
plt.show()
[ ]: plt.barh(x, y)
plt.show()
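The bar() and barh() calls above rely on x and y defined in an earlier cell. A self-contained sketch follows; the category names and counts are hypothetical sample data, not values from this notebook.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data for illustration
categories = ["A", "B", "C", "D"]
counts = [3, 8, 1, 10]

# Vertical bars: categories on the x-axis, heights taken from counts
bars = plt.bar(categories, counts, color="g")
plt.yticks(range(0, 11, 2))
plt.show()

# Horizontal bars: same data, with bar lengths along the x-axis
plt.figure()  # start a new figure so the two charts do not overlap
plt.barh(categories, counts)
plt.show()
```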
1.1.4 Legends
Plot legends give meaning to a visualization by labeling each plotted element.
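[50]: import matplotlib.pyplot as plt
x = list(range(0, 11))
y = [2 * value for value in x]
y2 = [value**2 for value in x]
plt.figure(dpi=150)
# The label of each line is the text shown in the legend
plt.plot(x, y, c='b', marker='^', label='$2x$')
# Second line: its color and marker are illustrative choices
plt.plot(x, y2, c='r', marker='o', label='$x^2$')
plt.xlabel('X Axis')
plt.ylabel('Y Axis')
plt.xticks(range(0, 11))
plt.yticks(range(0, 101, 10))
plt.grid()
plt.title("Legend Example")
# framealpha is the opacity of the legend background (1.0 is fully opaque)
plt.legend(fancybox=True, framealpha=1.0, shadow=True, borderpad=1)
plt.show()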
1.1.5 Pie Chart
[3]: import matplotlib.pyplot as plt
#plt.figure(dpi=100)
plt.title("Fruits", loc="left")
#plt.legend(title='Fruits',loc='upper left')
plt.show()
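A minimal sketch of a complete pie chart drawn with plt.pie(), reusing the title and legend settings from the cell above. The fruit names and quantities are hypothetical and only stand in for real data.

```python
import matplotlib.pyplot as plt

# Hypothetical data for illustration
fruits = ["Apples", "Bananas", "Cherries", "Dates"]
quantities = [35, 25, 25, 15]

# autopct prints each wedge's share of the total as a percentage
plt.pie(quantities, labels=fruits, autopct="%1.0f%%")
plt.title("Fruits", loc="left")
plt.legend(title="Fruits", loc="upper left")
plt.show()
```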