graphs using matplotlib
79 rows × 6 columns
Scatter plot
A scatter plot uses Cartesian coordinates to display the values of two variables for a set of data. The data
are displayed as a collection of points: the value of one variable determines each point's position on the
horizontal axis, and the value of the other variable determines its position on the vertical axis. Points that
lie far from the rest of the data can be regarded as outliers.
x_coords = [1,2,3,4,5]
y_coords = [1,2,3,4,5]
fig = plt.figure(figsize=(6,5))
plt.scatter(x_coords, y_coords, marker='s')
plt.show()
plt.hist(tab['total'])
plt.xlabel('total_ marks')
plt.ylabel('frequency')
plt.show()
In the context of a histogram, "bins" refer to the intervals into which the entire range of data is divided. Each
bin represents a range of values, and the height (or length) of each bar in the histogram indicates how many
data points fall within that bin's range.
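To make the idea of bins concrete, the sketch below counts how many values fall into each interval. The data and the bin width of 5 are made up for illustration; they are not taken from the notebook's datasets. The stop value passed to np.arange is chosen here as max + width so that the largest value is covered by the last edge.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical sample data, chosen only to illustrate binning
values = np.array([1, 2, 2, 3, 5, 7, 8, 8, 9, 10])

# Bin edges 0, 5, 10: two bins, [0, 5) and [5, 10]
width = 5
bin_edges = np.arange(0, values.max() + width, width)

# np.histogram returns the count per bin and the edges it used
counts, edges = np.histogram(values, bins=bin_edges)
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left} to {right}: {count} values")

# plt.hist draws one bar per bin, using the same edges
plt.hist(values, bins=bin_edges, edgecolor='black')
plt.xlabel('value')
plt.ylabel('frequency')
plt.show()
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both. That is why the value 5 is counted in the second bin and the value 10 is still counted.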
Line graph
Line charts show the relationship between two variables, X and Y, each plotted on its own axis.
x = [5,8,10]
y = [12,16,6]
x2 = [6,9,11]
y2 = [6,15,7]
plt.plot(x,y,label='line one', linewidth=5, color='r')
plt.plot(x2,y2,label='line two',linewidth=5, color='g')
plt.title('Epic Info')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.legend()
plt.grid(True)
plt.show()
In [8]: import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]
# Create a plot
plt.plot(x, y, 'bo-')
[0. 0.2 0.4 0.6 0.8 1. 1.2 1.4 1.6 1.8 2. 2.2 2.4 2.6 2.8 3. 3.2 3.4
3.6 3.8 4. 4.2 4.4 4.6 4.8]
In [13]: t**1
Out[13]: array([0. , 0.2, 0.4, 0.6, 0.8, 1. , 1.2, 1.4, 1.6, 1.8, 2. , 2.2, 2.4,
       2.6, 2.8, 3. , 3.2, 3.4, 3.6, 3.8, 4. , 4.2, 4.4, 4.6, 4.8])
In [14]: t**2
In [15]: results=t**3
Bar Graph
In [16]: # bar plot
# Sample data
categories = ['Category A', 'Category B', 'Category C', 'Category D']
values = [23, 45, 56, 78]
df = pd.read_csv(r'C:\Users\lenovo\Downloads\wine-quality-white-and-red.csv')
In [20]: df
0 white 7.0 0.270 0.36 20.7 0.045 45.0 170.0 1.00100 3.00 0.45 8.8
1 white 6.3 0.300 0.34 1.6 0.049 14.0 132.0 0.99400 3.30 0.49 9.5
2 white 8.1 0.280 0.40 6.9 0.050 30.0 97.0 0.99510 3.26 0.44 10.1
3 white 7.2 0.230 0.32 8.5 0.058 47.0 186.0 0.99560 3.19 0.40 9.9
4 white 7.2 0.230 0.32 8.5 0.058 47.0 186.0 0.99560 3.19 0.40 9.9
... ... ... ... ... ... ... ... ... ... ... ... ...
6492 red 6.2 0.600 0.08 2.0 0.090 32.0 44.0 0.99490 3.45 0.58 10.5
6493 red 5.9 0.550 0.10 2.2 0.062 39.0 51.0 0.99512 3.52 0.76 11.2
6494 red 6.3 0.510 0.13 2.3 0.076 29.0 40.0 0.99574 3.42 0.75 11.0
6495 red 5.9 0.645 0.12 2.0 0.075 32.0 44.0 0.99547 3.57 0.71 10.2
6496 red 6.0 0.310 0.47 3.6 0.067 18.0 42.0 0.99549 3.39 0.66 11.0
Create a histogram
In [21]: # create histogram
bin_edges = np.arange(0, df['residual sugar'].max() + 1, 10)
df['residual sugar'].max()+1
df['residual sugar']
Out[21]:
0 20.7
1 1.6
2 6.9
3 8.5
4 8.5
...
6492 2.0
6493 2.2
6494 2.3
6495 2.0
6496 3.6
Name: residual sugar, Length: 6497, dtype: float64
In [22]: bin_edges
Pair plot
A pair plot (scatterplot matrix) is a grid of scatterplots that shows how each pair of variables is related.
In seaborn, the hue parameter of pairplot colors the points according to the values of a chosen
variable.
This chart type helps in revealing the distribution of the data along a numeric variable, highlighting the
density and variation of the data more effectively than traditional scatter plots or box plots.
In [27]: df['quality'].unique()
Violin Plots
A violin plot is a hybrid of a box plot and a kernel density plot, so it shows where the data have peaks. It is
used to visualize the distribution of numerical data. Unlike a box plot, which shows only summary statistics,
a violin plot shows both summary statistics and the density of each variable.
Out[29]: <Axes: >
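The difference between the two plot types can be seen by drawing both for the same data. The sketch below uses matplotlib's own boxplot and violinplot on synthetic samples; it is not the notebook's original cell, and the sample names are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic samples: one with a single peak, one with two peaks
rng = np.random.default_rng(1)
unimodal = rng.normal(0, 1, 300)
bimodal = np.concatenate([rng.normal(-2, 0.5, 150), rng.normal(2, 0.5, 150)])
samples = [unimodal, bimodal]

fig, (ax_box, ax_violin) = plt.subplots(1, 2, figsize=(8, 4), sharey=True)

# Box plot: median, quartiles and whiskers only
ax_box.boxplot(samples)
ax_box.set_title('Box plot')

# Violin plot: the outline width follows the estimated density
parts = ax_violin.violinplot(samples, showmedians=True)
ax_violin.set_title('Violin plot')

# Both plot types place the samples at x = 1 and x = 2
for ax in (ax_box, ax_violin):
    ax.set_xticks([1, 2])
    ax.set_xticklabels(['unimodal', 'bimodal'])

plt.show()
```

In the box plot the two-peaked sample looks like a single wide box, while the violin outline shows the two separate peaks.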
Subplots
The subplot layout is organized in rows and columns, given by the first and second arguments. The third
argument is the index of the current plot, counted from 1.
In [30]: x = range(11)
y = range(11)
plt.show()
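A minimal sketch of the rows, columns, index convention using plt.subplot is shown below. The data and titles are assumptions for illustration and only loosely follow the x = range(11) cell above.

```python
import matplotlib.pyplot as plt

x = range(11)
y_linear = list(x)
y_square = [v ** 2 for v in x]

fig = plt.figure(figsize=(8, 3))

# 1 row, 2 columns, first panel (indices start at 1)
plt.subplot(1, 2, 1)
plt.plot(x, y_linear)
plt.title('y = x')

# 1 row, 2 columns, second panel
plt.subplot(1, 2, 2)
plt.plot(x, y_square)
plt.title('y = x**2')

plt.tight_layout()
plt.show()
```

Each call to plt.subplot makes that panel the current one, so the plotting calls that follow it draw into that panel.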
In [31]: import matplotlib.pyplot as plt
import numpy as np
# Data to plot
labels = ['Apples', 'Bananas', 'Cherries', 'Dates']
sizes = [15, 30, 45, 10] # Sizes represent the proportions
colors = ['red', 'yellow', 'pink', 'brown'] # Colors for each slice
# Add a title
plt.title('Fruit Pie Chart')
plt.pie(
(7,3),
labels=('spam','ham'),
shadow=True,
colors=('yellowgreen', 'lightskyblue'),
explode=(0,0.15), # space between slices
startangle=90, # rotate counter-clockwise by 90 degrees
autopct='%1.1f%%',# display each fraction as a percentage
)
plt.legend()
plt.axis('equal') # equal aspect ratio so the pie is drawn as a circle
plt.show()
In [34]: import matplotlib.pyplot as plt
# Data to plot
labels = ['A', 'B', 'C', 'D']
sizes = [15, 30, 45, 10]
colors = ['gold', 'yellowgreen', 'lightcoral', 'lightskyblue']
wedgeprops = {'edgecolor': 'black', 'linewidth': 1}
textprops = {'fontsize': 12, 'color': 'blue'}
plt.hist(x1, bins=bins)
plt.hist(x2, bins=bins)
plt.show()
Assignment
Reproduce the graph below using the CSV file attached to your mail (model_data.csv). Do not alter the .csv
file.
img = mpimg.imread('C:/Users/lenovo/Downloads/mymodel.png')
plt.imshow(img)
plt.axis('off') # Hide axes if not needed
plt.show()