
Histogram

Uploaded by

Jeeva Harshi
Copyright
© © All Rights Reserved

Histogram

A histogram is a common graphical representation of the distribution of a quantitative
feature. We start by breaking the range of the values into a number of bins or classes.

We tally the counts of the values falling in each bin and then make the plot by drawing
rectangles whose bases are the bin intervals and whose heights are the counts.
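
The binning and tallying steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the sample values, the choice of three equal-width bins, and the names values, num_bins, edges and counts are all hypothetical and do not come from the nutri data.

```python
# Hypothetical sample; the values and the bin count are illustrative only.
values = [1.0, 2.5, 2.5, 3.0, 4.5, 5.0, 7.0, 9.0, 10.0]
num_bins = 3

lo, hi = min(values), max(values)
width = (hi - lo) / num_bins

# Bin edges: num_bins equal-width intervals covering [lo, hi].
edges = [lo + i * width for i in range(num_bins + 1)]

# Tally how many values fall in each bin.
counts = [0] * num_bins
for v in values:
    # Index of the bin containing v; the maximum goes into the last bin.
    idx = min(int((v - lo) / width), num_bins - 1)
    counts[idx] += 1

# Each (interval, count) pair gives the base and height of one rectangle.
for i, c in enumerate(counts):
    print(f"{edges[i]:.1f} to {edges[i + 1]:.1f}: {c}")
```

Each printed line corresponds to one rectangle of the histogram: the interval is its base and the count is its height.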

In Python we can use the function plt.hist. For example, Figure 1.3 shows a histogram of the
226 ages in nutri, constructed via the following Python code.

    import numpy as np
    import matplotlib.pyplot as plt

    # nutri is assumed to be the data frame loaded earlier, with a column 'age'
    weights = np.ones_like(nutri.age) / nutri.age.count()
    plt.hist(nutri.age, bins=9, weights=weights, facecolor='cyan',
             edgecolor='black', linewidth=1)
    plt.xlabel('age')
    plt.ylabel('Proportion of Total')
    plt.show()

Here 9 bins were used. Rather than using raw counts (the default), the vertical axis here
gives the proportion in each class, defined by count/total. This is achieved by choosing the
"weights" parameter to be equal to the vector with entries 1/226, with length 226.
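
The effect of the weights can be checked numerically without drawing anything. The sketch below assumes NumPy is available and uses np.histogram, which applies the same kind of binning, on a randomly generated stand-in for the ages. The array ages and the random seed are hypothetical and are not the nutri data.

```python
import numpy as np

# Hypothetical stand-in for the 226 ages; not the actual nutri data.
rng = np.random.default_rng(0)
ages = rng.integers(65, 92, size=226)

# One weight of 1/n per observation, so each bin total equals count/total.
weights = np.ones(len(ages)) / len(ages)

counts, edges = np.histogram(ages, bins=9)
proportions, _ = np.histogram(ages, bins=9, weights=weights)

print(counts)             # raw counts per bin (the default)
print(proportions)        # bin totals when the weights are supplied
print(proportions.sum())  # approximately 1
```

With equal weights of 1/n, every weighted bin total is the raw count divided by n, so the bar heights sum to 1.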

Various plotting parameters have also been changed.

[Figure 1.3: Histogram of 'age'. Horizontal axis: age; vertical axis: Proportion of Total.]

Histograms can also be used for discrete features, although it may be necessary to explicitly
specify the bins and placement of the ticks on the axes.

1.5.2.3 Empirical Cumulative Distribution Function

The empirical cumulative distribution function, denoted by $F_n$, is a step function which
jumps an amount $k/n$ at observation values, where $k$ is the number of tied observations at
that value. For observations $x_1, \ldots, x_n$, $F_n(x)$ is the fraction of observations less
than or equal to $x$, i.e.,

$$F_n(x) = \frac{\text{number of } x_i \leqslant x}{n} = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}\{x_i \leqslant x\}, \tag{1.2}$$

where $\mathbb{1}$ denotes the indicator function; that is, $\mathbb{1}\{x_i \leqslant x\}$ is
equal to 1 when $x_i \leqslant x$ and 0 otherwise.
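
The definition can be evaluated directly as the mean of the indicator values. The following is a minimal sketch, assuming NumPy is available. The helper name ecdf and the five data values are hypothetical and are chosen so that one value is tied (k = 2, n = 5).

```python
import numpy as np

def ecdf(sample, x):
    """Fraction of observations in `sample` that are <= x."""
    sample = np.asarray(sample)
    # (sample <= x) is the indicator; its mean is (1/n) * sum of indicators.
    return np.mean(sample <= x)

# Hypothetical data with a tie at 3 (k = 2 tied observations, n = 5).
data = [1, 3, 3, 4, 7]

print(ecdf(data, 0))   # below every observation
print(ecdf(data, 3))   # just after the tied value
print(ecdf(data, 7))   # at the largest observation

# Size of the jump at the tied value 3; it should be close to k/n = 2/5.
jump_at_3 = ecdf(data, 3) - ecdf(data, 2.999)
print(jump_at_3)
```

The jump at the tied value is about 2/5, in line with the k/n rule, while an untied value such as 4 produces a jump of 1/5.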

To produce a plot of the empirical cumulative distribution function we can use the plt.step
function. The result for the age data is shown in Figure 1.4. The empirical cumulative distribution
function for a discrete quantitative variable is obtained in the same way.

    # np, plt and nutri are as in the histogram code
    x = np.sort(nutri.age)
    y = np.linspace(0, 1, len(nutri.age))
    plt.xlabel('age')
    plt.ylabel('Fn(x)')
    plt.step(x, y)
    plt.xlim(x.min(), x.max())
    plt.show()
