42 Histograms2
import pandas as pd
import matplotlib.pyplot as plt
data = {'Name': ['Arnav', 'Sheela', 'Azhar', 'Bincy', 'Yash', 'Nazar'],
        'Height': [60, 61, 63, 65, 61, 60],
        'Weight': [47, 89, 52, 58, 50, 47]}
df = pd.DataFrame(data)
# Only the numeric columns (Height and Weight) are drawn in the histogram
df.plot(kind='hist')
plt.show()
Creating a Histogram
To create a histogram, first divide the whole range of the values into a
series of intervals called bins, then count the values which fall into each
of the intervals. Bins are consecutive, non-overlapping intervals of the
variable. The matplotlib.pyplot.hist() function is used to compute and draw
the histogram of x.
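The two steps described above can be sketched by hand. This is only an illustration, assuming five equal-width bins between the smallest and largest value; the names a, n_bins, edges and counts are chosen for this sketch. Matplotlib's hist() delegates the equivalent computation to numpy.histogram().

```python
import numpy as np

# Sample data (the ages used in the examples of this section)
a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27])

n_bins = 5
# Step 1: split the range [min, max] into consecutive, non-overlapping intervals
edges = np.linspace(a.min(), a.max(), n_bins + 1)

# Step 2: count the values that fall into each interval
counts = []
for i in range(n_bins):
    lo, hi = edges[i], edges[i + 1]
    if i == n_bins - 1:
        in_bin = (a >= lo) & (a <= hi)   # the last bin also includes its right edge
    else:
        in_bin = (a >= lo) & (a < hi)    # every other bin is half-open: [lo, hi)
    counts.append(int(in_bin.sum()))

print(counts)  # [4, 3, 3, 2, 3]
```

Every value lands in exactly one bin, so the counts add up to the number of values (15 here).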
In Matplotlib, histograms are created with the hist() function. It takes an
array of numbers as its argument and builds the histogram from it. The
following table shows the parameters accepted by the
matplotlib.pyplot.hist() function:
Attribute   Description
x           array or sequence of arrays
bins        optional parameter; an integer, a sequence or a string
density     optional parameter; a boolean value
range       optional parameter; the lower and upper range of the bins
histtype    optional parameter; the type of histogram to draw [bar, barstacked, step, stepfilled], default is 'bar'
align       optional parameter; controls the plotting of the histogram [left, right, mid]
weights     optional parameter; an array of weights having the same dimensions as x
bottom      location of the baseline of each bin
rwidth      optional parameter; the relative width of the bars with respect to the bin width
color       optional parameter; a color or sequence of color specs
label       optional parameter; a string or sequence of strings to match multiple datasets
log         optional parameter; sets the histogram axis to a log scale
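Several of the parameters in the table (histtype, rwidth, color and label) are not used in the examples of this section, so a short sketch may help. It assumes a second dataset b, which is hypothetical and invented for this illustration, as are the colour names and labels.

```python
import numpy as np
import matplotlib.pyplot as plt

a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27])
b = np.array([15, 35, 42, 48, 52, 58, 61, 66, 70, 90])  # hypothetical second dataset

plt.figure()
counts, bin_edges, patches = plt.hist(
    [a, b],                          # x given as a sequence of arrays
    bins=[0, 25, 50, 75, 100],       # explicit bin edges shared by both datasets
    histtype='bar',                  # the default type: bars drawn side by side
    rwidth=0.8,                      # bars take up 80% of the bin width
    color=['steelblue', 'orange'],   # one colour per dataset
    label=['group A', 'group B'],    # one legend label per dataset
)
plt.legend()
plt.xlabel('age')
plt.ylabel('count')
plt.title('Histogram Example')
```

Calling plt.show() afterwards displays the figure. With more than one dataset, hist() returns one row of counts per dataset.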
Example1:
x: array or sequence of arrays
import numpy as np
import matplotlib.pyplot as plt
# Creating dataset
a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27])
# Creating histogram
plt.hist(a)
# Show plot
plt.xlabel('age')
plt.ylabel('count')
plt.title('Histogram Example')
plt.show()
Example2:
# Creating dataset
a = np.array([22, 87, 5, 43, 56,73, 55, 54, 11,20, 51, 5, 79, 31,27])
# Creating histogram
# bins can be an int, a sequence or a string (use one call at a time)
plt.hist(a, bins=5, ec='red')  # bins=int, ec=edge colour
# plt.hist(a, bins=[0, 25, 50, 75, 100], ec='red')  # bins=sequence of bin edges
# plt.hist(a, bins='auto', ec='red')  # bins=string naming a binning strategy
# Show plot
plt.xlabel('age')
plt.ylabel('count')
plt.title('Histogram Example')
plt.show()
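To compare the three forms of bins, the counts can be computed without drawing anything. The following is a minimal sketch using numpy.histogram(), which hist() relies on for binning. The string 'auto' is one of several strategy names NumPy accepts; the number of bins it picks depends on the data, so no particular value is assumed here.

```python
import numpy as np

a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27])

# bins as an int: 5 equal-width bins between min(a) and max(a)
counts_int, edges_int = np.histogram(a, bins=5)

# bins as a sequence: the bin edges are given explicitly
counts_seq, edges_seq = np.histogram(a, bins=[0, 25, 50, 75, 100])

# bins as a string: NumPy chooses the number of bins from the data
counts_auto, edges_auto = np.histogram(a, bins='auto')

print(counts_int.tolist())   # [4, 3, 3, 2, 3]
print(counts_seq.tolist())   # [5, 3, 5, 2]
print(len(counts_auto))      # number of bins chosen by the 'auto' strategy
```

Whichever form is used, each value is counted once, so all three sets of counts sum to 15.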
Example3:
# Creating dataset
a = np.array([22, 87, 5, 43, 56,73, 55, 54, 11,20, 51, 5, 79, 31,27])
# Creating histogram
# bins given as an int: 5 equal-width bins
plt.hist(a, bins=5, ec='red')  # bins=int, ec=edge colour
# Show plot
plt.xlabel('age')
plt.ylabel('count')
plt.title('Histogram Example')
plt.show()
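The bins and range can also be given together, for example 3 bins over the range 5 to 90:
plt.hist(a, bins=3, range=(5, 90), ec='red')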
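The effect of range can be checked numerically. This sketch again uses numpy.histogram() in place of the plotting call; the second range, (20, 60), is chosen only for illustration and shows that values outside the range are left out of the count.

```python
import numpy as np

a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27])

# 3 equal-width bins between 5 and 90: every value of a is inside the range
counts_all, edges_all = np.histogram(a, bins=3, range=(5, 90))
print(counts_all.tolist())   # [7, 5, 3]

# 2 equal-width bins between 20 and 60: values below 20 or above 60 are ignored
counts_part, edges_part = np.histogram(a, bins=2, range=(20, 60))
print(counts_part.tolist())  # [4, 5]
```

Only 9 of the 15 values fall between 20 and 60, so the second histogram counts 9 values in total.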
Example4:
density=True
# Creating dataset
a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27])
# Creating histogram
# density=True scales the bars so that the total area of the histogram is 1
plt.hist(a, bins=5, ec='red', density=True)  # bins=int, ec=edge colour
# Show plot
plt.xlabel('age')
plt.ylabel('density')
plt.title('Histogram Example')
plt.show()
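With density=True the bar heights are no longer counts. A small sketch, using the values that hist() returns, can confirm that the bar areas (height times bin width) add up to 1; the variable names heights, widths and area are chosen for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27])

plt.figure()
# hist() returns the bar heights, the bin edges and the drawn patches
heights, edges, _ = plt.hist(a, bins=5, ec='red', density=True)

widths = np.diff(edges)                  # width of each bin
area = float(np.sum(heights * widths))   # total area under the histogram
print(round(area, 6))  # 1.0
```

Multiplying each height by its bin width and by the number of values (15) gives back the plain counts.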
Example5:
import numpy as np
# Creating dataset
a = np.array([1,12,22,21,20,21])
# Creating histogram
# weights gives each value of a its own contribution to the bin count
plt.hist(a, bins=5, ec='red', weights=[2, 2, 2, 2, 3, 3])
# Show plot
plt.xlabel('age')
plt.ylabel('count')
plt.title('Histogram Example')
plt.show()
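To see what weights changes, the plain and weighted counts for the data of Example5 can be compared. This is a sketch using numpy.histogram(), which accepts the same bins and weights arguments; the names w, plain and weighted are chosen for this illustration.

```python
import numpy as np

a = np.array([1, 12, 22, 21, 20, 21])
w = [2, 2, 2, 2, 3, 3]   # one weight per value of a

# Without weights every value adds 1 to its bin
plain, edges = np.histogram(a, bins=5)

# With weights every value adds its own weight to its bin
weighted, _ = np.histogram(a, bins=5, weights=w)

print(plain.tolist())   # [1, 0, 1, 0, 4]
print(weighted)
```

The last bin holds the four values 22, 21, 20 and 21, so its weighted height is 2 + 2 + 3 + 3 = 10 instead of 4.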
Example6:
cumulative=True
each bin shows its own count plus the counts of all bins with smaller values
cumulative=-1
each bin shows its own count plus the counts of all bins with greater values
# Creating dataset
a = np.array([1, 12, 22, 21, 20, 21])
# Creating histogram
plt.hist(a, bins=5, ec='red', cumulative=True)
# plt.hist(a, bins=5, ec='red', cumulative=-1)
# Show plot
plt.xlabel('age')
plt.ylabel('count')
plt.title('Histogram Example')
plt.show()
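The two cumulative modes can be compared through the heights that hist() returns. In this sketch each call is drawn on its own figure so the bars do not overlap; the names plain, up and down are chosen for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

a = np.array([1, 12, 22, 21, 20, 21])

plt.figure()
plain, _, _ = plt.hist(a, bins=5, ec='red')                   # ordinary counts
plt.figure()
up, _, _ = plt.hist(a, bins=5, ec='red', cumulative=True)     # running total, left to right
plt.figure()
down, _, _ = plt.hist(a, bins=5, ec='red', cumulative=-1)     # running total, right to left

print(plain.astype(int).tolist())  # [1, 0, 1, 0, 4]
print(up.astype(int).tolist())     # [1, 1, 2, 2, 6]
print(down.astype(int).tolist())   # [6, 5, 5, 4, 4]
```

With cumulative=True the last bar equals the total number of values; with cumulative=-1 the first bar does.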
Customising Histogram:
Taking the same data as above, let us now see how the histogram can be customised. Let
us change the edgecolor, which is the border of each bar, to green. Also, let us change
the line style to ":" and the line width to 2. Another property, fill, takes boolean
values. The default True means each bar will be filled with color, and False means each
bar will be empty. The hatch property can be used to fill each bar with a pattern
('-', '+', 'x', '\\', '*', 'o', 'O', '.'). In Program 4-10, we have used the hatch
value "o".
import pandas as pd
import matplotlib.pyplot as plt
# data is the dictionary defined in the first program of this section
df = pd.DataFrame(data)
df.plot(kind='hist', edgecolor='green', linewidth=2, linestyle=':', fill=False, hatch='o')
plt.show()
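The same styling keywords can also be passed directly to plt.hist(). The sketch below assumes only the Height values from the first program and reads the properties back from the drawn bars to show that they were applied; the names heights and bars are chosen for this illustration.

```python
import matplotlib.pyplot as plt

heights = [60, 61, 63, 65, 61, 60]   # Height column from the first program

plt.figure()
counts, edges, bars = plt.hist(
    heights,
    bins=5,
    edgecolor='green',   # border colour of each bar
    linewidth=2,         # border thickness
    linestyle=':',       # dotted border
    fill=False,          # leave the bars empty
    hatch='o',           # fill pattern drawn inside each bar
)

first_bar = bars[0]
print(first_bar.get_hatch())      # o
print(first_bar.get_fill())       # False
print(first_bar.get_linewidth())
```

Calling plt.show() afterwards displays the figure with dotted green outlines and the circle pattern.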