DATA VISUALIZATION - Part 4

- A histogram is a bar chart showing frequency distribution, where the data is grouped into ranges called bins and plotted as bars based on frequency.
- The document demonstrates how to construct a histogram manually by binning data ranges and counting frequencies, and provides examples of drawing histograms in Python using Pandas by specifying data, bins, colors and other parameters.
- Key points discussed include how Python automatically creates bins if not specified, and how the number and values of bins can be customized.

Uploaded by

Adithya

DATA VISUALIZATION USING PYTHON PANDAS
INFORMATICS PRACTICES
CLASS XII
HISTOGRAMS
It is a bar chart showing a FREQUENCY DISTRIBUTION.
The data is grouped into ranges, such as "100 to 199", "200 to 299", etc., and then plotted as bars based on the frequency values. Each range is also called a "bin".
It is similar to a bar graph, with the difference that in a histogram each bar stands for a range of data.
The width of each bar corresponds to the class interval (the bin), while the height of each bar, read on the y-axis, corresponds to the frequency of the class it represents.
CONCEPT OF FREQUENCY DISTRIBUTION:
Let's consider a test given to students, marked out of 50. These are the scores they got:

Test scores: 20, 30, 45, 32, 34, 24, 25, 48, 50, 50, 49

Based on the scores, let's see how many students scored in each range of scores:

Range    Number of students
20-25    3
26-30    1
31-35    2
36-40    0
41-45    1
46-50    4

This data is called the frequency distribution table.
To manually construct a histogram:
1. The first step is to "bin" the range of values, i.e., divide the entire range of values into a series of intervals. The bins may or may not be of the same size.
2. Then count how many values fall into each interval.
NOTE: The bins are usually non-overlapping intervals of a variable.
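The two steps can be tried out in plain Python before any plotting. This is only a small sketch: the names scores, intervals and frequency are chosen here for illustration, and the intervals are the inclusive ranges used in the frequency distribution table.

```python
# Test scores from the example above
scores = [20, 30, 45, 32, 34, 24, 25, 48, 50, 50, 49]

# Step 1: "bin" the range of values into non-overlapping class intervals.
# Each interval is (lowest mark, highest mark), both inclusive, as in the table.
intervals = [(20, 25), (26, 30), (31, 35), (36, 40), (41, 45), (46, 50)]

# Step 2: count how many scores fall into each interval.
frequency = {}
for low, high in intervals:
    frequency[f"{low}-{high}"] = sum(1 for s in scores if low <= s <= high)

for label, count in frequency.items():
    print(label, count)
```

The printed counts are the heights of the bars in the histogram.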
The histogram of the data above has one bar for each range in the table, with heights 3, 1, 2, 0, 1 and 4 (figure not reproduced).
HOW TO DRAW HISTOGRAMS IN PYTHON?
Using the marks data given above, let's write the code to draw the histogram with the hist() function of matplotlib.pyplot.

Example 1:
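import matplotlib.pyplot as plt

# Test scores of the 11 students
data = [20, 30, 45, 32, 34, 24, 25, 48, 50, 50, 49]
# Bin edges; the last edge (51) closes the 46-50 bin so that 48, 49 and 50 are counted
b = [20, 26, 31, 36, 41, 46, 51]

plt.hist(data, bins=b, color="green", label="marks")
plt.xlabel("student marks")
plt.ylabel("frequency")
plt.legend()
plt.show()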
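Example 2 (same data, with a black edge drawn around each bar):
import matplotlib.pyplot as plt

data = [20, 30, 45, 32, 34, 24, 25, 48, 50, 50, 49]
# The last edge (51) closes the 46-50 bin so that 48, 49 and 50 are counted
b = [20, 26, 31, 36, 41, 46, 51]

plt.hist(data, bins=b, color="green", label="marks", edgecolor="black")
plt.xlabel("student marks")
plt.ylabel("frequency")
plt.legend()
plt.show()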
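hist() leaves out any value that lies below the first bin edge or above the last one, so it is worth checking the edges against the data before plotting. A minimal sketch of such a check follows; values_outside is a hypothetical helper written for this note, not a Matplotlib function.

```python
def values_outside(values, edges):
    """Return the values that lie below the first edge or above the last edge."""
    return [v for v in values if v < edges[0] or v > edges[-1]]


marks = [20, 30, 45, 32, 34, 24, 25, 48, 50, 50, 49]

# Edges that stop at 46 do not cover the highest marks
print(values_outside(marks, [20, 26, 31, 36, 41, 46]))      # [48, 50, 50, 49]
# Adding one more edge at 51 covers every mark
print(values_outside(marks, [20, 26, 31, 36, 41, 46, 51]))  # []
```

An empty list means every value will land in some bar.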
Example of a histogram with varying bin sizes

Bin       Values included     Frequency
0-33      >=0 and <33         2
33-45     >=33 and <45        3
45-60     >=45 and <60        1
60-100    >=60 and <=100      4

Example 3:
import matplotlib.pyplot as plt
data=[40,60,55,20,35,70,60,89,20,33]
bins=[0,33,45,60,100]
plt.hist(data, bins, color="green", edgecolor="black")
plt.show()
What happens if we do not specify the intervals (the bins)?
• hist() automatically makes 10 bins of equal width from the data given to it.
import matplotlib.pyplot as plt
data=[40,60,55,20,35,70,60,89,20,33]
plt.hist(data, color="green", edgecolor="black")
plt.show()
What happens if we specify only the number of bins we want?
• hist() divides the range from the lowest to the highest value into that many bins of equal width.
import matplotlib.pyplot as plt
data=[40,60,55,20,35,70,60,89,20,33]
plt.hist(data, bins=5, color="green", edgecolor="black")
plt.show()
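In both cases (no bins given, or only a number of bins given) the bin edges come from dividing the range between the lowest and the highest value into equal parts. The sketch below only illustrates that arithmetic; equal_width_edges is a hypothetical helper written by hand for this note, not the library's own code.

```python
def equal_width_edges(values, n=10):
    """Split the range from min(values) to max(values) into n equal bins."""
    low, high = min(values), max(values)
    width = (high - low) / n
    # n bins need n + 1 edges; the last edge is set to the maximum itself
    return [low + i * width for i in range(n)] + [high]


data = [40, 60, 55, 20, 35, 70, 60, 89, 20, 33]

# Default case: 10 bins, so 11 edges, each bin (89 - 20) / 10 = 6.9 wide
print([round(e, 1) for e in equal_width_edges(data)])
# bins=5: 6 edges, each bin (89 - 20) / 5 = 13.8 wide
print([round(e, 1) for e in equal_width_edges(data, n=5)])
```

Fewer bins give wider bars, so the same data looks smoother but shows less detail.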
Key points about BINS:
→ bins can be a sequence of numbers, in increasing order, giving the bin edges.
→ It is an optional argument to hist().
→ When it is not given, hist() by default creates 10 bins of equal width from the data. It takes the lowest value and the highest value in the data and divides that range into 10 equal parts.
→ If bins is customized as [11,15,20,30], there will be 3 bins (one less than the number of values given).
→ For the above example, the bins would be
11----15 (including 11 but excluding 15)
15----20 (including 15 but excluding 20)
20----30 (including both 20 and 30)
→ We can also specify the number of bins we need by writing bins=n.
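The rule for which edge belongs to which bin can be checked by counting by hand. A minimal sketch, assuming the rule stated above (every bin includes its left edge and excludes its right edge, except the last bin, which includes both); count_in_bins is a hypothetical helper, not a Matplotlib function.

```python
def count_in_bins(values, edges):
    """Count values per bin: each bin is [left, right), the last one is [left, right]."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            left, right = edges[i], edges[i + 1]
            in_bin = left <= v < right or (i == last and v == right)
            if in_bin:
                counts[i] += 1
                break  # a value belongs to one bin only
    return counts


# 15 goes to the second bin, 20 and 30 both go to the last bin
print(count_in_bins([11, 15, 20, 30], [11, 15, 20, 30]))  # [1, 1, 2]

# Data and bins of Example 3
data = [40, 60, 55, 20, 35, 70, 60, 89, 20, 33]
print(count_in_bins(data, [0, 33, 45, 60, 100]))  # [2, 3, 1, 4]
```

Four edges give three counts, one less than the number of edges.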
How to create gaps between the bars?
import matplotlib.pyplot as plt
data=[40,60,55,20,35,70,60,89,20,33]
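plt.hist(data, bins=5, color="green", edgecolor="black", rwidth=0.9)
plt.show()

rwidth is the width of each bar as a fraction of the bin width. By default the bars fill the whole bin, which for a single set of data is the same as rwidth=1.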
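To see where the gaps come from, the width of each drawn bar can be worked out by hand. The sketch below assumes that each bar is centred inside its bin, as in the default bar-type histogram; bar_spans is a hypothetical helper used only for this illustration.

```python
def bar_spans(edges, rwidth=1.0):
    """Return (left, right) of each drawn bar, centred inside its bin."""
    spans = []
    for left, right in zip(edges[:-1], edges[1:]):
        bin_width = right - left
        bar_width = rwidth * bin_width
        margin = (bin_width - bar_width) / 2  # empty space on each side of the bar
        spans.append((left + margin, right - margin))
    return spans


edges = [0, 10, 20, 30]
print(bar_spans(edges))              # rwidth=1: neighbouring bars touch
print(bar_spans(edges, rwidth=0.9))  # rwidth=0.9: a gap appears between bars
```

With bins 10 units wide and rwidth=0.9, each bar is 9 units wide, which leaves a gap of 1 unit between neighbouring bars.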
