
HISTOGRAM

High-school notes best for study and revision

Uploaded by

BEAST TM 96
Copyright
© All Rights Reserved

HISTOGRAMS

A histogram is a bar chart that shows a FREQUENCY DISTRIBUTION.

The data is grouped into ranges, such as "100 to 199", "200 to 299", etc., and each range is plotted as a bar whose height is the number of values that fall in it. These ranges are also called "bins".
A histogram is similar to a bar graph, with one difference: in a histogram each bar stands for a range of data.
The width of each bar corresponds to its class interval (the bin) along the x-axis, while the height of each bar corresponds to the frequency of the class it represents, read off the y-axis.
CONCEPT OF FREQUENCY DISTRIBUTION:
Let's consider a test given to students out of 50 marks. The following are the scores they get.

Test scores: 20, 30, 45, 32, 34, 24, 25, 48, 50, 50, 49

From the scores, let's see how many students scored in each range of scores:

Range of scores    Number of students
20-25              3
26-30              1
31-35              2
36-40              0
41-45              1
46-50              4

This table is called the frequency distribution table.
To manually construct a histogram:
1. The first step is to "bin" the range of values, i.e., divide the entire range of values into a series of intervals. The bins may or may not all be the same size.
2. Then count how many values fall into each interval.
NOTE: The bins are usually consecutive, non-overlapping intervals of a variable.
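The two steps above can be sketched in plain Python. This is only an illustration, not how matplotlib is implemented: the function name frequency_table is made up for this example, and the bin edges assume the ranges 20-25, 26-30, ... 46-50 from the table, with every bin including its lower edge and only the last bin including its upper edge.

```python
def frequency_table(values, edges):
    """Count how many values fall into each bin defined by consecutive edges."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            lo, hi = edges[i], edges[i + 1]
            # every bin is [lo, hi); the last bin also includes its upper edge
            if lo <= v < hi or (i == last and v == hi):
                counts[i] += 1
                break
    return counts


scores = [20, 30, 45, 32, 34, 24, 25, 48, 50, 50, 49]
edges = [20, 26, 31, 36, 41, 46, 51]  # step 1: the bins
counts = frequency_table(scores, edges)  # step 2: count values per bin

for lo, hi, c in zip(edges, edges[1:], counts):
    print(f"{lo}-{hi - 1}: {c}")
```

The printed counts match the frequency distribution table for the test scores (3, 1, 2, 0, 1, 4).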
So the histogram of the previously mentioned data looks like:
[Figure: histogram of the test scores]
HOW TO DRAW HISTOGRAMS IN PYTHON?
Considering the marks data given above, let's write the code to draw the histogram in Python using matplotlib.

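Example 1:
import matplotlib.pyplot as plt

data = [20, 30, 45, 32, 34, 24, 25, 48, 50, 50, 49]
# bin edges for the ranges 20-25, 26-30, 31-35, 36-40, 41-45, 46-50
# the final edge 51 is needed so that the marks 48, 49 and 50 are counted
b = [20, 26, 31, 36, 41, 46, 51]

plt.hist(data, bins=b, color="green", label="marks")
plt.xlabel("student marks")
plt.ylabel("frequency")
plt.legend()
plt.show()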
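Example 2:
import matplotlib.pyplot as plt

data = [20, 30, 45, 32, 34, 24, 25, 48, 50, 50, 49]
# the final edge 51 is needed so that the marks 48, 49 and 50 are counted
b = [20, 26, 31, 36, 41, 46, 51]
# edgecolor draws a black outline around every bar
plt.hist(data, bins=b, color="green", label="marks", edgecolor="black")
plt.xlabel("student marks")
plt.ylabel("frequency")
plt.legend()
plt.show()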
Example of a histogram with varying bin size

Bin       Range               Frequency
0-33      >=0 and <33         2
33-45     >=33 and <45        3
45-60     >=45 and <60        1
60-100    >=60 and <=100      4

Example 3:
import matplotlib.pyplot as plt

data = [40, 60, 55, 20, 35, 70, 60, 89, 20, 33]
bins = [0, 33, 45, 60, 100]  # bin edges of unequal width
plt.hist(data, bins, color="green", edgecolor="black")
plt.show()
What happens if we do not specify the intervals or the bins?
• matplotlib automatically makes 10 bins of equal width from the data given to it.

import matplotlib.pyplot as plt

data = [40, 60, 55, 20, 35, 70, 60, 89, 20, 33]
plt.hist(data, color="green", edgecolor="black")
plt.show()
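The arithmetic behind the 10 default bins can be sketched in plain Python. This only mirrors the idea (take the lowest and highest value, then cut that range into 10 equal parts); matplotlib does the calculation internally, and the variable names below are illustrative.

```python
data = [40, 60, 55, 20, 35, 70, 60, 89, 20, 33]

n_bins = 10                        # the default number of bins
low, high = min(data), max(data)   # the range covered by the bins
width = (high - low) / n_bins      # every bin has the same width

# 10 bins need 11 edges: low, low + width, ..., high
edges = [low + i * width for i in range(n_bins + 1)]

print("bin width:", round(width, 2))
print("edges:", [round(e, 2) for e in edges])
```

For this data the range runs from 20 to 89, so each of the 10 bins is (89 - 20) / 10 wide.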
What happens if we specify the number of bins we want it to create?
• The range of the data is divided into that many bins of equal width (here, 5).

import matplotlib.pyplot as plt

data = [40, 60, 55, 20, 35, 70, 60, 89, 20, 33]
plt.hist(data, bins=5, color="green", edgecolor="black")
plt.show()
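plt.hist also returns the values it worked out, which is a handy way to check the bins it created. The following is a minimal sketch, assuming matplotlib is installed; the Agg backend line is only there so the script runs without opening a window, and the variable names counts, edges and patches are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no window is opened
import matplotlib.pyplot as plt

data = [40, 60, 55, 20, 35, 70, 60, 89, 20, 33]

# hist returns the frequency of each bin, the bin edges, and the drawn bars
counts, edges, patches = plt.hist(data, bins=5, color="green", edgecolor="black")

print("frequencies:", counts)
print("bin edges:", edges)
```

With bins=5 there are 5 frequencies and 6 bin edges, one more edge than the number of bins.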
Key points about BINS:
• bins can be a sequence of numbers giving the bin edges, or a single integer giving the number of bins.
• It is an optional argument to hist().
• When it is not given, matplotlib by default creates 10 bins of equal width from the data. It takes the lowest value and the highest value in the data and divides that range into 10 equal parts.
• If bins is given as a custom list such as [11,15,20,30], then there will be 3 bins (that is, one less than the number of values in the list).
• For the above example, the bins would be
  11----15 (including 11 but excluding 15)
  15----20 (including 15 but excluding 20)
  20----30 (including both 20 and 30)
• We can also specify the number of bins we need by writing bins=n.
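The rule about which edge belongs to which bin can be checked with a short sketch. The helper bin_index is a hypothetical name used only for this illustration; it uses the standard-library bisect module to copy the rule stated above (each bin includes its lower edge, and only the last bin includes its upper edge) and is not matplotlib's own code.

```python
from bisect import bisect_right


def bin_index(value, edges):
    """Return the index of the bin holding value, or None if it is outside all bins."""
    if value < edges[0] or value > edges[-1]:
        return None
    if value == edges[-1]:
        return len(edges) - 2  # the last bin is closed on the right
    # bisect_right counts the edges <= value, so subtract 1 for the bin index
    return bisect_right(edges, value) - 1


edges = [11, 15, 20, 30]  # 4 edges give 3 bins: index 0, 1 and 2
for v in [11, 15, 20, 30, 31]:
    print(v, "->", bin_index(v, edges))
```

Here 15 lands in the second bin (index 1), both 20 and 30 land in the third bin (index 2), and 31 is outside every bin.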
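How to create gaps between the bars?
import matplotlib.pyplot as plt

data = [40, 60, 55, 20, 35, 70, 60, 89, 20, 33]
# rwidth=0.9 draws each bar at 90% of its bin width, leaving a gap
plt.hist(data, bins=5, color="green", edgecolor="black", rwidth=0.9)
plt.show()

rwidth is the width of each bar as a fraction of the bin width. By default, a single set of data is drawn with bars at the full bin width (the same as rwidth=1), so the bars touch.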


Single array:
import matplotlib.pyplot as plt
a = [1, 4, 7, 12, 13, 15]
plt.hist(a)
plt.show()
Multiple arrays:
import matplotlib.pyplot as plt
a = [1, 4, 7, 12, 13, 15]
b = [10, 14, 17, 2, 3, 11]
plt.hist([a, b])  # pass the arrays together as a list
plt.show()
Number of bins given as an integer:
import matplotlib.pyplot as plt
a = [1, 4, 7, 12, 13, 15]
plt.hist(a, bins=50)  # 50 bins of equal width
plt.show()
import matplotlib.pyplot as plt
a=[1,4,7,12,13,15]
plt.hist(a, bins=[1,5,10,15])
plt.show()
Cumulative means that each bar also includes the counts of all the previous bars.

cumulative=False (the default):
import matplotlib.pyplot as plt
a = [1, 4, 7, 12, 13, 15]
plt.hist(a, bins=[1, 5, 10, 15], cumulative=False)
plt.show()

cumulative=True:
import matplotlib.pyplot as plt
a = [1, 4, 7, 12, 13, 15]
plt.hist(a, bins=[1, 5, 10, 15], cumulative=True)
plt.show()
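The effect of cumulative=True on the bar heights can be worked out by hand as a running total. The sketch below uses itertools.accumulate from the standard library; the counts list was obtained by counting the values of a in the bins 1-5, 5-10 and 10-15, and the variable names are illustrative.

```python
from itertools import accumulate

# a = [1, 4, 7, 12, 13, 15] with bins [1, 5, 10, 15]:
# 1 and 4 fall in the first bin, 7 in the second, 12, 13 and 15 in the third
counts = [2, 1, 3]

# each cumulative bar is its own count plus all the counts before it
cumulative = list(accumulate(counts))

print("normal bars:    ", counts)
print("cumulative bars:", cumulative)
```

The last cumulative bar always equals the total number of values that fall inside the bins, here 6.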
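import matplotlib.pyplot as plt
a = [1, 4, 7, 12, 13, 15]
b = [11, 14, 17, 2, 3, 12]
# 'barstacked': the bars of a and b are stacked on top of each other
plt.hist([a, b], bins=[1, 5, 10, 15], histtype='barstacked')
plt.show()

import matplotlib.pyplot as plt
a = [1, 4, 7, 12, 13, 15]
b = [11, 14, 17, 2, 3, 12]
# 'bar': the bars of a and b are drawn side by side (the default)
plt.hist([a, b], bins=[1, 5, 10, 15], histtype='bar')
plt.show()

import matplotlib.pyplot as plt
a = [1, 4, 7, 12, 13, 15]
b = [11, 14, 17, 2, 3, 12]
# 'step': only the outline is drawn, without filling
plt.hist([a, b], bins=[1, 5, 10, 15], histtype='step')
plt.show()

import matplotlib.pyplot as plt
a = [1, 4, 7, 12, 13, 15]
b = [11, 14, 17, 2, 3, 12]
# 'stepfilled': the outline is drawn and filled with colour
plt.hist([a, b], bins=[1, 5, 10, 15], histtype='stepfilled')
plt.show()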
