Advanced R Programming Tidyverse Packages Notes
The core Tidyverse includes ggplot2, dplyr, tidyr, readr, purrr, tibble,
stringr, and forcats; the first seven are mentioned in this article. All of
these packages are installed at once with the install.packages("tidyverse")
command and loaded at once with library(tidyverse).
The Tidyverse packages in R fall into the following groups:
1. Data Visualization and Exploration
ggplot2
2. Data Wrangling and Transformation
dplyr
tidyr
stringr
Note: All of the above packages come under the tidyverse package, so there
is no need to install them separately.
1) tidyr: tidyr is a data cleaning library in R which helps to
create tidy data. Tidy data means that every data cell holds
a single value, each column is a variable, and each row is an
observation.
One of the most important packages in R is the tidyr package. The sole
purpose of the tidyr package is to simplify the process of creating tidy
data. Tidy data describes a standard way of storing data that is used
wherever possible throughout the tidyverse. Once you make sure
that your data is tidy, you'll spend less time fighting with the tools and
more time working on your analysis.
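To make the definition of tidy data concrete, here is a minimal sketch that reshapes a small untidy table into tidy form with tidyr's pivot_longer(). The table, its column names, and the values are hypothetical and chosen only for illustration.

```r
library(tidyr)

# Hypothetical untidy table: there is one column per year, so the
# "year" variable is hidden inside the column names.
untidy <- tibble::tibble(
  country = c("A", "B"),
  `2000`  = c(10, 20),
  `2001`  = c(15, 25)
)

# Move the year columns into a single "year" column so that each row is
# one observation (one country in one year).
tidy <- pivot_longer(
  untidy,
  cols      = c(`2000`, `2001`),
  names_to  = "year",
  values_to = "cases"
)
tidy
```

After reshaping, every column is a variable (country, year, cases) and every row is a single observation.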
g) fill() function: fill() fills missing values in the selected columns
using the previous entry. This is useful for the common output format in
which values are not repeated and are recorded only when they change.
Missing values are replaced in atomic vectors; NULLs are replaced in lists.
Example:
df=data.frame(Month=1:6,
year=c(2000,rep(NA,5)))
df
df %>% fill(year)
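By default fill() carries the previous entry downwards. As a small sketch of the opposite case, the .direction argument can be set to "up" so that a value is carried backwards instead. The data frame below is hypothetical.

```r
library(tidyr)

# Hypothetical data: the year is recorded only on the last row of each group.
df_up <- data.frame(Month = 1:6,
                    year  = c(NA, NA, 2000, NA, NA, 2001))

# .direction = "up" fills each missing value from the next non-missing entry.
filled_up <- fill(df_up, year, .direction = "up")
filled_up
```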
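h) full_seq() function: full_seq() creates the full sequence of values in a
vector, filling in the values that should have been observed but weren't. The
vector must be numeric, and the second argument gives the expected gap
between consecutive values.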
Example:
num_vec<-c(3,7,9,14,19,20)
full_vec<-full_seq(num_vec,1)
full_vec
i) drop_na() function: This function drops rows that contain missing values.
Example:
drop_df<-tibble(s.no=c(1:10),
Name=c("Jhon","Smith","Perer","luke","King",rep(NA,5)))
drop_df
dfg<-drop_df %>% drop_na(Name)
dfg
j) replace_na() function: This function replaces missing values with a value you supply.
Example:
drop_df<-tibble(s.no=c(1:10),
Name=c("Jhon","Smith","Perer","luke","King",rep(NA,5)))
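The example above only builds the tibble. A minimal sketch of the replacement step itself could look like the following; the replacement value "Unknown" and the name replaced_df are arbitrary choices for illustration.

```r
library(tidyr)
library(tibble)

drop_df <- tibble(s.no = 1:10,
                  Name = c("Jhon", "Smith", "Perer", "luke", "King", rep(NA, 5)))

# replace_na() takes a named list: one replacement value per column.
# "Unknown" is an assumed placeholder, not a value from the original notes.
replaced_df <- replace_na(drop_df, list(Name = "Unknown"))
replaced_df
```

Unlike drop_na(), this keeps all ten rows and only changes the missing cells.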
2) tibble: The tibble package has a nice printing method that shows only the
first 10 rows and all the columns that fit on the screen. This is useful when
we work with large data.
Example:
tf<-tibble(x=letters,y=1:26,z=sample(50,26))
tf
View(tf)
# is_tibble: Check whether a particular set of data is a tibble or not.
Example:
is_tibble(mtcars)
# as_tibble: Convert an existing data frame into a tibble.
Example:
is_tibble(as_tibble(mtcars))
# glimpse: glimpse() prints a transposed view of the data, with each column
shown as a row together with its type and first few values.
Example:
glimpse(mtcars)
# enframe: enframe() converts a vector or list into a two-column tibble with name and value columns.
Example:
enframe(1:3)
# deframe: deframe() is the reverse; it converts a one- or two-column data frame back into a vector, so the result is no longer a tibble.
Example:
v<-deframe(tibble(a=1:3))
is_tibble(v)
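The relationship between enframe() and deframe() is easiest to see with a named vector, where the names survive the round trip. A minimal sketch, with a hypothetical vector called scores:

```r
library(tibble)

# Hypothetical named vector.
scores <- c(a = 1, b = 2, c = 3)

# enframe() stores the names in a "name" column and the values in "value".
score_tbl <- enframe(scores)
score_tbl

# deframe() uses the first column as names and the second as values.
back <- deframe(score_tbl)
back
```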
# Mathematical operations: a column can be computed from columns defined earlier in the same call.
mat<-tibble(x=1:5,y=1,z=x^2+y)
mat
# Add rows
Example:
df<-tibble(x=1:3,y=3:1)
df %>% add_row(x=4,y=0)
df %>% add_row(x=4:5,y=0,.before=2)
Note: To add rows before the second row, use the .before argument.
# Add column
Example:
df<-tibble(x=1:3,y=3:1)
df %>% add_column(z=-1:1,w=0)
# To get the previous 3 dates, subtract the numbers from Sys.Date().
Example:
data_t <- tibble(a = 1:3, b = letters[1:3],
c = Sys.Date() - 1:3)
print(data_t)
# To get the next 3 dates, add the numbers to Sys.Date().
Example:
data1 <- data.frame(a = 1:3, b = letters[1:3],
c = Sys.Date() + 1:3)
print(data1)
3) purrr package: purrr is a popular R package that provides
a consistent and powerful set of tools for working with functions and
vectors. It was developed by Hadley Wickham and is part of the tidyverse
suite of packages. purrr is an essential package for functional
programming in R.
Purrr provides a set of functions that are designed to work with functional
programming concepts, such as mapping, filtering, and reducing. These
functions are designed to work with lists, data frames, and other objects,
making it easier to work with complex data structures.
The main functions provided by purrr include map(), walk(), reduce(),
accumulate(), and compose(). These functions can be used for a
variety of tasks, such as applying a function to each element of a list,
filtering a list based on a condition, and reducing a list to a single value.
Example1:
my_list<-list(
c(1,2,6),
c(4,7,1),
c(9,1,5)
)
#Find the mean of each vector by using map function
my_list %>% map(mean)
map(my_list,mean)
Example2:
df<-iris[1:4]
means<-map(df,mean)
means
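The two examples above use only map(), which always returns a list. As a rough sketch of the other tasks mentioned for purrr (filtering and reducing), the following reuses my_list with map_dbl(), keep(), reduce(), and accumulate(). The result names are hypothetical.

```r
library(purrr)

my_list <- list(
  c(1, 2, 6),
  c(4, 7, 1),
  c(9, 1, 5)
)

# map_dbl() works like map() but returns a numeric vector instead of a list.
means_vec <- map_dbl(my_list, mean)

# keep() filters: retain only the vectors whose mean is greater than 3.
big_means <- keep(my_list, function(v) mean(v) > 3)

# reduce() combines all elements into one value; here an element-wise sum.
col_totals <- reduce(my_list, `+`)

# accumulate() is like reduce() but keeps every intermediate result.
running_sum <- accumulate(c(1, 2, 3, 4), `+`)
```

Choosing map_dbl() over map() is convenient when the result is passed on to functions that expect a plain numeric vector.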
4) readr package: readr reads different file formats using different
functions, namely read_csv() for comma-separated files, read_tsv() for
tab-separated files, read_table() for whitespace-separated files, and
read_delim() for files with any other delimiter. The readr library provides a
simple and fast way to read rectangular data such as tsv, csv, and other
delimited files.
Note: before working with the readr package, set the working directory with
setwd() or pass the full path of the file to the read function.
getwd()
my_path<-"E:\\R programme\\R.Directory\\CLASS (2).CSV"
# col_names=FALSE names the columns X1, X2, ... instead of using the first row
dset<-read_csv(my_path,col_names=FALSE)
dset
# structure of the dataset
str(dset)
#class of dataset
dset %>% class()
#View
dset %>% View()
#Rename a column of dset
datas<-dset %>% rename(Gen=X2)
datas
# Delete columns
datas<-dset %>% select(-X3,-X4)
datas
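The example above depends on a file that exists only on the author's machine. A self-contained sketch of the same read, rename, and select steps is shown below, assuming a small made-up dataset written to a temporary file. The column names, values, and object names are hypothetical.

```r
library(readr)
library(dplyr)

# Write a small hypothetical dataset to a temporary CSV file.
tmp <- tempfile(fileext = ".csv")
write_csv(data.frame(name = c("Asha", "Ben"), age = c(21, 25)), tmp)

# col_types = "cd" declares a character column followed by a double column.
dset2 <- read_csv(tmp, col_types = "cd")
dset2

# Rename one column, then drop another.
datas2 <- dset2 %>%
  rename(Name = name) %>%
  select(-age)
datas2
```

Because this file has a header row, read_csv() takes the column names from it; automatic names such as X2 appear only when no header is used.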
5) stringr: stringr is a library with many functions for data
cleaning and data preparation tasks. It is designed for working with
strings and makes string handling an easy process.
All of the functions in stringr start with str_ and take a string vector as
their first argument. Some of these functions include str_detect(),
str_extract(), str_match(), str_count(), str_replace(), str_subset(), etc. If
you want to install stringr, the best method is to install the tidyverse using
install.packages("tidyverse").
Example1:
mystring<-"Data is the new Science"
str_sub(mystring,1,4)
Example2:
str_sub(mystring,17,-1)<-"art"
mystring
e) str_split(): This function splits a string into pieces at each match of the pattern.
Example
str_split(mystring,pattern=" ")
f) str_to_lower(): This function converts a string to lowercase.
Example
str_to_lower(c("ABC","JKL"))
g) str_to_upper(): This function converts a string to uppercase.
Example
str_to_upper(c("abc","mno"))
h) str_to_title(): This function converts a string to title case, capitalizing the first letter of each word.
Example
str_to_title(c("abnm","word"))
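The pattern-matching functions named in the stringr introduction (str_detect(), str_count(), str_subset(), str_replace(), str_extract()) are not demonstrated above. A minimal sketch, using a hypothetical vector of fruit names:

```r
library(stringr)

# Hypothetical input vector.
fruits <- c("apple", "banana", "cherry")

# Does each string contain "an"?
has_an <- str_detect(fruits, "an")

# How many times does "a" occur in each string?
n_a <- str_count(fruits, "a")

# Keep only the strings that contain "rr".
with_rr <- str_subset(fruits, "rr")

# Replace the first "a" in each string with "A".
replaced <- str_replace(fruits, "a", "A")

# Extract the first vowel from each string.
first_vowel <- str_extract(fruits, "[aeiou]")
```

Each function takes the string vector first and the pattern second, which is why they chain well with the pipe operator.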
6) tribble(): The tribble() function creates a tibble from a readable, row-wise layout in R.
Example
# Named test_scores rather than cat to avoid masking base R's cat() function
test_scores<-tribble(~test_1, ~test_2, ~test_3, ~test_4,
56,67,78,89,
54,67,21,20,
89,43,21,29,
23,24,25,25)
print(test_scores)