
Split Data Frame in R into Multiple Parts Randomly
When a data frame is large, we can split it into multiple parts randomly. This is useful when we want to analyze only part of the data at a time. We can do this with the split function, using the sample function to assign the rows to groups at random.
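The mechanics can be sketched in three steps: build one group label per row, shuffle the labels, and split on them. This is a minimal illustration only; the data frame `df` and the names `labels`, `shuffled` and `parts` are hypothetical and not part of the examples that follow.

```r
# Hypothetical six-row data frame, used only for illustration
df <- data.frame(id = 1:6, value = c(10, 20, 30, 40, 50, 60))

# One group label per row: three 1s followed by three 2s
labels <- rep(1:2, times = c(3, 3))

# sample() without a size argument returns a random permutation of the labels
shuffled <- sample(labels)

# split() returns a list of data frames, one per distinct label
parts <- split(df, shuffled)

# Group sizes are fixed by `times`; only the row membership is random
sapply(parts, nrow)
```

Because the labels are only permuted, each part always has the size requested in `times`, while the rows that land in each part change from run to run.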
Example
Consider the trees data in base R −
> str(trees)
'data.frame': 31 obs. of 3 variables:
 $ Girth : num 8.3 8.6 8.8 10.5 10.7 10.8 11 11 11.1 11.2 ...
 $ Height: num 70 65 63 72 81 83 66 75 80 75 ...
 $ Volume: num 10.3 10.3 10.2 16.4 18.8 19.7 15.6 18.2 22.6 19.9 ...
Splitting the trees data into three parts of 10, 10 and 11 rows, which together cover all 31 rows −
> split(trees, sample(rep(1:3,times=c(10,10,11))))
$`1`
   Girth Height Volume
2    8.6     65   10.3
3    8.8     63   10.2
10  11.2     75   19.9
12  11.4     76   21.0
13  11.4     76   21.4
16  12.9     74   22.2
21  14.0     78   34.5
22  14.2     80   31.7
25  16.3     77   42.6
26  17.3     81   55.4

$`2`
   Girth Height Volume
5   10.7     81   18.8
6   10.8     83   19.7
8   11.0     75   18.2
11  11.3     79   24.2
14  11.7     69   21.3
17  12.9     85   33.8
20  13.8     64   24.9
28  17.9     80   58.3
29  18.0     80   51.5
30  18.0     80   51.0

$`3`
   Girth Height Volume
1    8.3     70   10.3
4   10.5     72   16.4
7   11.0     66   15.6
9   11.1     80   22.6
15  12.0     75   19.1
18  13.3     86   27.4
19  13.7     71   25.7
23  14.5     74   36.3
24  16.0     72   38.3
27  17.5     82   55.7
31  20.6     87   77.0
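Since sample draws at random, running the same call again will usually put different rows in each part. If a repeatable split is needed, one common approach is to call set.seed first. The sketch below assumes an arbitrary seed value of 123; the names `first` and `second` are illustrative only.

```r
# Fix the random number generator state; 123 is an arbitrary seed
set.seed(123)
first <- split(trees, sample(rep(1:3, times = c(10, 10, 11))))

# Resetting the same seed reproduces the same random assignment
set.seed(123)
second <- split(trees, sample(rep(1:3, times = c(10, 10, 11))))

# Compare the two lists of data frames
identical(first, second)
```

With the seed reset before each call, the two splits should be identical; without it, they would generally differ.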
Consider the women data in base R −
> str(women)
'data.frame': 15 obs. of 2 variables:
 $ height: num 58 59 60 61 62 63 64 65 66 67 ...
 $ weight: num 115 117 120 123 126 129 132 135 139 142 ...
Splitting the women data into two parts of 10 and 5 rows −
> split(women, sample(rep(1:2,times=c(10,5))))
$`1`
   height weight
2      59    117
4      61    123
5      62    126
6      63    129
7      64    132
9      66    139
11     68    146
12     69    150
14     71    159
15     72    164

$`2`
   height weight
1      58    115
3      60    120
8      65    135
10     67    142
13     70    154
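In both examples the group sizes are written out by hand. When only the number of parts matters, the labels can be generated instead. The helper below is a hypothetical sketch, not a base R function: `split_randomly` and its arguments `df` and `k` are names chosen here for illustration.

```r
# Hypothetical helper: split `df` into `k` random parts of near-equal size
split_randomly <- function(df, k) {
  n <- nrow(df)
  # rep_len() recycles the labels 1..k until there is one label per row
  labels <- rep_len(seq_len(k), n)
  # sample.int(n) is a random permutation of the row positions; indexing with
  # it shuffles the labels without changing how many rows each label gets
  split(df, labels[sample.int(n)])
}

# Split the 15 rows of women into 4 parts
parts <- split_randomly(women, 4)
sapply(parts, nrow)
```

With this labelling, part sizes differ by at most one row, so the 15 rows of women fall into three parts of 4 rows and one part of 3.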