In statistics, the term "frequency" indicates the number of occurrences of a value in a given data sample. As a software meant for mathematical and scientific analysis, Pandas has many in-built methods to calculate frequency from a given sample.
Absolute Frequency It is same as just the frequency where the number of occurrences of a data element is calculated. In the below example, we simply count the number of times the name of a city is appearing in a given DataFrame and report it out as frequency.
Approach 1 − We use the pandas method named .value_counts.
Example
import pandas as pd # Create Data Frame data = ["Chandigarh","Hyderabad","Pune","Pune","Chandigarh","Pune"] # use the method .value_counts() df = pd.Series(data).value_counts() print(df)
Output
Running the above code gives us the following result −
Pune 3 Chandigarh 2 Hyderabad 1 dtype: int64
Approach 2 − We use the pandas method named .crosstab
Example
import pandas as pd data = ["Chandigarh","Hyderabad","Pune","Pune","Chandigarh","Pune"] df = pd.DataFrame(data,columns=["City"]) tab_result = pd.crosstab(index=df["City"],columns=["count"]) print(tab_result)
Output
Running the above code gives us the following result −
col_0 count City Chandigarh 2 Hyderabad 1 Pune 3
RelativeFrequency − This is a fraction between a given frequency and the total number of observations in a data sample. So the value can be a floating point value which can also be expressed as a percentage. To find it out we first calculate the frequency as shown in the first approach and then divide it with total number of observations which is found out using the len() function.
Example
import pandas as pd # Create Data Frame data = ["Chandigarh","Hyderabad","Pune","Pune","Chandigarh","Pune"] # use the method .value_counts() df = pd.Series(data).value_counts() print(df/len(data))
Output
Running the above code gives us the following result −
Pune 0.500000 Chandigarh 0.333333 Hyderabad 0.166667 dtype: float64