
Split Long String into Vector of Substrings of Equal Sizes in R
If a vector was recorded as a single string by mistake, or the file containing the data did not separate the values properly, we may need to split the string into its correct form before proceeding with further analysis. This can happen when the levels of a factor variable that have equal name lengths are not separated. In this case, we can split the string into a vector of equal-sized substrings by using the substring function.
Examples
The following examples show how the substring function can split a string into a vector of substrings:
Factor<-"aabbccddabacadbabcbdcacbcddadbdc"
substring(Factor,seq(1,nchar(Factor),2),seq(2,nchar(Factor),2))
Output
[1] "aa" "bb" "cc" "dd" "ab" "ac" "ad" "ba" "bc" "bd" "ca" "cb" "cd" "da" "db"
[16] "dc"

x1<-"abcdefghijklmopqrstuvwxyz"

substring(x1,seq(1,nchar(x1),2),seq(2,nchar(x1),2))
[1] "ab" "cd" "ef" "gh" "ij" "kl" "mo" "pq" "rs" "tu" "vw" "xy" ""

substring(x1,seq(1,nchar(x1),2),seq(3,nchar(x1),2))
[1] "abc" "cde" "efg" "ghi" "ijk" "klm" "mop" "pqr" "rst" "tuv" "vwx" "xyz"
[13] ""

substring(x1,seq(1,nchar(x1),3),seq(3,nchar(x1),3))
[1] "abc" "def" "ghi" "jkl" "mop" "qrs" "tuv" "wxy" ""

substring(x1,seq(1,nchar(x1),4),seq(3,nchar(x1),4))
[1] "abc" "efg" "ijk" "mop" "rst" "vwx" ""

substring(x1,seq(1,nchar(x1),4),seq(4,nchar(x1),4))
[1] "abcd" "efgh" "ijkl" "mopq" "rstu" "vwxy" ""

substring(x1,seq(1,nchar(x1),4),seq(5,nchar(x1),4))
[1] "abcde" "efghi" "ijklm" "mopqr" "rstuv" "vwxyz" ""

substring(x1,seq(1,nchar(x1),5),seq(5,nchar(x1),5))
[1] "abcde" "fghij" "klmop" "qrstu" "vwxyz"

substring(x1,seq(1,nchar(x1),10),seq(5,nchar(x1),10))
[1] "abcde" "klmop" "vwxyz"

substring(x1,seq(1,nchar(x1),10),seq(10,nchar(x1),10))
[1] "abcdefghij" "klmopqrstu" ""

substring(x1,seq(1,nchar(x1),10),seq(2,nchar(x1),10))
[1] "ab" "kl" "vw"

substring(x1,seq(1,nchar(x1),10),seq(3,nchar(x1),10))
[1] "abc" "klm" "vwx"

substring(x1,seq(1,nchar(x1),10),seq(5,nchar(x1),10))
[1] "abcde" "klmop" "vwxyz"

substring(x1,seq(1,nchar(x1),2),seq(2,nchar(x1)+2-1,2))
[1] "ab" "cd" "ef" "gh" "ij" "kl" "mo" "pq" "rs" "tu" "vw" "xy" "z"

substring(x1,seq(1,nchar(x1),4),seq(4,nchar(x1)+4-1,4))
[1] "abcd" "efgh" "ijkl" "mopq" "rstu" "vwxy" "z"

substring(x1,seq(1,nchar(x1),3),seq(4,nchar(x1)+4-1,3))
[1] "abcd" "defg" "ghij" "jklm" "mopq" "qrst" "tuvw" "wxyz" "z"

substring(x1,seq(1,nchar(x1),5),seq(4,nchar(x1)+4-1,5))
[1] "abcd" "fghi" "klmo" "qrst" "vwxy"

substring(x1,seq(1,nchar(x1),2),seq(4,nchar(x1)+4-1,2))
[1] "abcd" "cdef" "efgh" "ghij" "ijkl" "klmo" "mopq" "pqrs" "rstu" "tuvw"
[11] "vwxy" "xyz" "z"

substring(x1,seq(1,nchar(x1),2),seq(5,nchar(x1)+5-1,2))
[1] "abcde" "cdefg" "efghi" "ghijk" "ijklm" "klmop" "mopqr" "pqrst" "rstuv"
[10] "tuvwx" "vwxyz" "xyz" "z"
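Several of the outputs above end in an empty string "". This happens when the vector of end positions is shorter than the vector of start positions: substring recycles the shorter vector, so the last start position is paired with a small end position and nothing is returned. The later examples avoid this by extending the end sequence to nchar(x1)+n-1. As a minimal sketch of the same idea, the pattern can be wrapped in a small helper. The name split_fixed and its arguments are hypothetical and are not part of base R; the helper assumes a single non-missing string and a positive chunk size.

```r
# Hypothetical helper (not part of base R): split one string `x`
# into consecutive chunks of `n` characters. The last chunk may be
# shorter when nchar(x) is not a multiple of n.
split_fixed <- function(x, n) {
  stopifnot(is.character(x), length(x) == 1, !is.na(x), n >= 1)
  len <- nchar(x)
  if (len == 0) return(character(0))  # seq(1, 0, by = n) would raise an error
  starts <- seq(1, len, by = n)       # 1, n + 1, 2n + 1, ...
  ends <- pmin(starts + n - 1, len)   # same length as starts, capped at len
  substring(x, starts, ends)
}

split_fixed("aabbccddabacadbabcbdcacbcddadbdc", 2)
split_fixed("abcdefghijklmopqrstuvwxyz", 4)
# [1] "abcd" "efgh" "ijkl" "mopq" "rstu" "vwxy" "z"
```

Because starts and ends always have the same length, no recycling takes place and no empty trailing element appears. Note that the sample string x1 used throughout has 25 characters (it contains no "n"), which is why a chunk such as "mop" appears in the outputs.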