
Create Relative Frequency Table Using dplyr in R
The relative frequency of a category is its count divided by the total count across all categories, which is why it is also called the proportional frequency. For example, if we have 5 bananas, 6 guavas, and 10 pomegranates, the relative frequency of banana is 5 divided by the total of 5 + 6 + 10 = 21, that is, 5/21 or about 0.238.
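The arithmetic for the fruit example can be written out in full. This is only a worked version of the numbers given above; the last line is a check that the relative frequencies of all categories sum to 1.

```latex
\begin{align*}
\text{banana}      &= \frac{5}{5 + 6 + 10}  = \frac{5}{21}  \approx 0.238 \\
\text{guava}       &= \frac{6}{5 + 6 + 10}  = \frac{6}{21}  \approx 0.286 \\
\text{pomegranate} &= \frac{10}{5 + 6 + 10} = \frac{10}{21} \approx 0.476 \\
\frac{5}{21} + \frac{6}{21} + \frac{10}{21} &= 1
\end{align*}
```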
Example 1
Consider the below data frame −
set.seed(21)
x <- sample(LETTERS[1:4], 20, replace = TRUE)
Ratings <- sample(1:50, 20)
df1 <- data.frame(x, Ratings)
df1
Output
   x Ratings
1  C      44
2  A      29
3  C      14
4  A      10
5  B      46
6  C       1
7  D      47
8  A       8
9  C      23
10 C       7
11 D      50
12 B      31
13 B      34
14 B       3
15 D      48
16 B      33
17 C      45
18 B       9
19 B      40
20 C      21
Loading dplyr package −
library(dplyr)
Finding the table of relative frequencies of values in x −
df1 %>% group_by(x) %>% summarise(n = n()) %>% mutate(freq = n / sum(n))
Output
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 4 x 3
  x         n  freq
  <chr> <int> <dbl>
1 A         3  0.15
2 B         7  0.35
3 C         7  0.35
4 D         3  0.15
Warning message:
`...` is not empty.
We detected these problematic arguments:
* `needs_dots`
These dots only exist to allow future extensions and should be empty.
Did you misspecify an argument?
Note − This warning can be ignored here. It does not change the computed values: the counts add up to the 20 rows of the data frame and the relative frequencies sum to 1, so the table is correct.
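The same table can probably be produced a little more compactly with dplyr's count(), which combines the group_by() and summarise(n = n()) steps. The following is a minimal sketch, not the article's own method; to keep it reproducible regardless of the random number generator, the x values are typed in from the printed data frame above, and the names x_values and rel_freq are chosen only for this illustration.

```r
library(dplyr)

# x column copied from the printed output of df1 (rows 1 to 20)
x_values <- c("C", "A", "C", "A", "B", "C", "D", "A", "C", "C",
              "D", "B", "B", "B", "D", "B", "C", "B", "B", "C")
df1 <- data.frame(x = x_values)

rel_freq <- df1 %>%
  count(x) %>%                 # one row per distinct value of x, count stored in column n
  mutate(freq = n / sum(n))    # divide each count by the total number of rows

rel_freq
```

Because count() returns an ungrouped result, this form should not print the `summarise()` ungrouping message.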
Example 2
y <- sample(c("Male", "Female"), 20, replace = TRUE)
Salary <- sample(20000:50000, 20)
df2 <- data.frame(y, Salary)
df2
Output
        y Salary
1  Female  40907
2  Female  47697
3    Male  49419
4  Female  23818
5    Male  21585
6    Male  22276
7  Female  21856
8    Male  22092
9    Male  27892
10 Female  47655
11   Male  34933
12 Female  48027
13 Female  48179
14   Male  21460
15   Male  24233
16 Female  43762
17 Female  22369
18 Female  47206
19   Male  34972
20 Female  30222
Finding the relative frequencies of genders in y −
df2 %>% group_by(y) %>% summarise(n = n()) %>% mutate(freq = n / sum(n))
Output
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 2 x 3
  y          n  freq
  <chr>  <int> <dbl>
1 Female    11  0.55
2 Male       9  0.45
Warning message:
`...` is not empty.
We detected these problematic arguments:
* `needs_dots`
These dots only exist to allow future extensions and should be empty.
Did you misspecify an argument?
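As a quick cross-check of the dplyr result, the relative frequencies can also be computed with base R's table() and prop.table(). This sketch is an addition for verification, not part of the dplyr approach; the y values are typed in from the printed data frame above, and the names y_values and rel_freq_base are chosen only for this illustration.

```r
# y column copied from the printed output of df2 (rows 1 to 20)
y_values <- c("Female", "Female", "Male", "Female", "Male",
              "Male", "Female", "Male", "Male", "Female",
              "Male", "Female", "Female", "Male", "Male",
              "Female", "Female", "Female", "Male", "Female")

counts <- table(y_values)             # absolute frequency of each category
rel_freq_base <- prop.table(counts)   # each count divided by the total count

rel_freq_base
```

The proportions for Female and Male should agree with the freq column of the tibble above.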