
Data Structure
Networking
RDBMS
Operating System
Java
MS Excel
iOS
HTML
CSS
Android
Python
C Programming
C++
C#
MongoDB
MySQL
Javascript
PHP
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
Find Mean of Multiple Columns Based on Character Column in R
If we have a character column that means we are more likely to have duplicated values in that column hence finding the mean of numerical columns based on the values in character column cannot be done directly. For this purpose, we can use aggregate function as shown in the below examples.
Example1
Consider the below data frame −
set.seed(214) x1<−sample(c("A","B","C"),20,replace=TRUE) x2<−rnorm(20,2,0.25) x3<−rnorm(20,524,32.14) x4<−rnorm(20,0.4,0.007) df1<−data.frame(x1,x2,x3,x4) df1
Output
x1 x2 x3 x4 1 C 1.798039 483.1139 0.4105840 2 A 2.044579 574.3082 0.3970706 3 B 1.563161 517.0169 0.3983528 4 A 1.722058 546.6778 0.4049876 5 C 2.279675 527.9437 0.3959229 6 C 1.721763 549.1909 0.4096398 7 A 1.903552 486.4005 0.4091737 8 C 2.268382 502.0860 0.3998178 9 B 1.637324 540.1949 0.3970408 10 B 2.204911 433.7537 0.4003291 11 A 2.124006 521.2464 0.3935380 12 C 1.830028 525.4325 0.3937420 13 C 1.353421 553.7063 0.3974116 14 B 2.067984 547.9531 0.3958582 15 C 1.849775 537.0034 0.3986610 16 C 1.763957 531.4417 0.4015034 17 A 1.932413 545.7784 0.4215530 18 B 2.147503 491.7046 0.3918060 19 C 2.222639 563.4852 0.3902110 20 A 1.979280 483.2847 0.3820964
Finding the column means based on values in x1 −
Example
aggregate(.~x1,data=df1,mean)
Output
x1 x2 x3 x4 1 A 1.950981 526.2827 0.4014032 2 B 1.924176 506.1246 0.3966774 3 C 1.898631 530.3782 0.3997215
Example2
y1<−sample(c("Male","Female"),20,replace=TRUE) y2<−rpois(20,2) y3<−rpois(20,5) y4<−rpois(20,3) y5<−rpois(20,2) y6<−rpois(20,1) y7<−rpois(20,4) df2<−data.frame(y1,y2,y3,y4,y5,y6,y7) df2
Output
y1 y2 y3 y4 y5 y6 y7 1 Male 1 6 1 1 1 4 2 Male 4 2 1 4 0 5 3 Male 2 6 5 3 1 2 4 Male 4 4 0 3 2 0 5 Male 1 2 6 2 2 2 6 Female 5 4 2 4 0 5 7 Male 2 4 3 1 2 3 8 Male 3 8 1 2 1 5 9 Male 0 3 2 1 2 5 10 Female 3 5 1 2 1 3 11 Female 2 5 3 2 1 2 12 Female 3 8 1 0 1 6 13 Female 1 2 5 2 1 3 14 Male 2 7 1 1 0 3 15 Male 3 5 2 4 2 4 16 Female 5 6 2 4 1 0 17 Female 2 3 1 0 1 3 18 Male 2 6 3 0 1 6 19 Female 2 6 5 1 0 0 20 Female 2 6 4 5 0 3
Finding the column means based on values in y1 −
aggregate(.~y1,data=df2,mean)
Output
y1 y2 y3 y4 y5 y6 y7 1 Female 2.777778 5.000000 2.666667 2.222222 0.6666667 2.777778 2 Male 2.181818 4.818182 2.272727 2.000000 1.2727273 3.545455
Advertisements