Gradrate Histogram

gradrate_histogram

September 6, 2023

1 Example 1.4 of the textbook - Histogram


1.1 First let us import the necessary Python tools/libraries
[1]: import pandas as pd # for data manipulation
import numpy as np # for math operations
import matplotlib.pyplot as plt # for plotting
import seaborn as sns # for plotting

1.2 Now let us import the data (the .csv file must be in the same folder as the .ipynb file)
[2]: gradrate = pd.read_csv("eg01-05gradrate.csv")
gradrate.head()

[2]:         STATE  PCTGRAD  REGION
     0     ALABAMA     89.3       S
     1      ALASKA     78.2       W
     2     ARIZONA     78.0       W
     3    ARKANSAS     88.0       S
     4  CALIFORNIA     82.7       W

Let us check how many data points the dataset contains


[3]: gradrate.shape[0] # number of rows (50 states + District of Columbia)

[3]: 51

[4]: gradrate.shape[1] #number of columns

[4]: 3

[5]: gradrate.shape

[5]: (51, 3)
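
As a small aside, the shape tuple can also be unpacked in one step. The sketch below rebuilds only the five rows printed by gradrate.head() above (not the full 51-row file), and the names `sample`, `n_rows`, and `n_cols` are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Only the five rows shown by gradrate.head(); the real file has 51 rows.
sample = pd.DataFrame({
    "STATE": ["ALABAMA", "ALASKA", "ARIZONA", "ARKANSAS", "CALIFORNIA"],
    "PCTGRAD": [89.3, 78.2, 78.0, 88.0, 82.7],
    "REGION": ["S", "W", "W", "S", "W"],
})

# .shape is a (rows, columns) tuple, so it can be unpacked directly.
n_rows, n_cols = sample.shape
print(n_rows, n_cols)

# len() of a DataFrame also returns the number of rows.
print(len(sample))
```

On the full dataset the same unpacking should give the 51 rows and 3 columns reported above.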

1.3 Histogram of the percentage of on-time HS graduates in the US
[6]: plt.hist(gradrate["PCTGRAD"])
plt.show()

[7]: plt.hist(gradrate["PCTGRAD"],bins=9)
plt.show()
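
To see what an integer `bins` value means, it may help to look at the bin edges directly. When `bins` is an integer, plt.hist splits the range from the smallest to the largest value into that many equal-width bins (10 by default). The sketch below uses NumPy's `histogram_bin_edges` on the five PCTGRAD values printed by gradrate.head(), so the edges differ from those of the full dataset; the variable names are illustrative assumptions.

```python
import numpy as np

# Only the five PCTGRAD values shown by gradrate.head(), not the full column.
pctgrad = np.array([89.3, 78.2, 78.0, 88.0, 82.7])

# An integer number of bins gives equal-width bins from min to max.
edges_default = np.histogram_bin_edges(pctgrad, bins=10)  # 10 bins -> 11 edges
edges_nine = np.histogram_bin_edges(pctgrad, bins=9)      # 9 bins -> 10 edges

print(edges_nine)
print(len(edges_default), len(edges_nine))
```

Because the edges depend on the minimum and maximum of the data, they rarely fall on round numbers, which is one reason to pass explicit bin edges instead.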

[8]: #help(plt.hist)

[9]: plt.hist(gradrate["PCTGRAD"],bins=[70,72.5,75,77.5,80,82.5,85,87.5,90,92.5])
plt.show()
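
The explicit list of edges above runs from 70 to 92.5 in steps of 2.5, so it can also be generated instead of typed out. The following is a minimal sketch, assuming NumPy's `linspace` and `histogram`; it counts only the five PCTGRAD values shown by gradrate.head(), and the names `edges` and `counts` are illustrative.

```python
import numpy as np

# Ten edges from 70 to 92.5 define nine bins of width 2.5.
edges = np.linspace(70, 92.5, 10)
print(edges)

# Only the five PCTGRAD values shown by gradrate.head(), not the full column.
pctgrad = np.array([89.3, 78.2, 78.0, 88.0, 82.7])

# np.histogram returns the count per bin; each bin includes its left edge,
# and only the last bin also includes its right edge.
counts, _ = np.histogram(pctgrad, bins=edges)
print(counts)
```

The generated array could be passed as `bins=edges` to plt.hist in place of the hand-written list.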

[10]: plt.hist(gradrate["PCTGRAD"],bins=[70,72.5,75,77.5,80,82.5,85,87.5,90,92.5])
plt.ylabel("Number of States")
plt.xlabel("Percentage of on-time HS graduates")

[10]: Text(0.5, 0, 'Percentage of on-time HS graduates')

1.4 Using the library seaborn: sns.histplot: https://seaborn.pydata.org/generated/seaborn.h
[11]: import seaborn as sns

[12]: sns.histplot(data=gradrate, x="PCTGRAD", bins=[70,72.5,75,77.5,80,82.5,85,87.5,90,92.5]) # it looks much better!
plt.show()

[13]: #help(sns.histplot)

[14]: sns.histplot(data=gradrate, x="PCTGRAD",binwidth=2.5,binrange=[70,92.5])
plt.ylabel("Number of States")
plt.xlabel("Percentage of on-time HS graduates")
plt.show()

[15]: p = sns.histplot(data=gradrate, x="PCTGRAD",binwidth=2.5,binrange=[70,92.5])
p.set(xlabel="Percentage of on-time HS graduates",
ylabel="Number of States",
title='Histogram')
plt.show()
