ProgrammingForDS16_Rdatamanipulation
Liana Harutyunyan
Programming for Data Science
April 25, 2024
American University of Armenia
liana.harutyunyan@aua.am
Data Manipulation
install.packages("dplyr")  # run once
library(dplyr)
library(ggplot2)  # provides the diamonds data set used in the examples
• Not required, but dplyr works best with the pipe operator
from the magrittr package.
• The %>% operator takes the object on its left-hand side
and passes it as the first argument to the function on its
right-hand side.
• To read a pipeline, replace each pipe with "then" (in your
mind, not in the code).
• Example: filter(...) %>% select(...) means FILTER,
THEN SELECT from the filtered rows.
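The "then" reading can be checked on a small example. The sketch below assumes a made-up data frame named scores (not part of the lecture) and compares a nested call with the piped version of the same steps.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration
scores <- data.frame(
  name  = c("Ani", "Aram", "Mane", "Davit"),
  group = c("A", "B", "A", "B"),
  score = c(90, 72, 85, 60)
)

# Nested call: has to be read from the inside out
nested <- select(filter(scores, score > 70), name, score)

# Piped call: filter, THEN select
piped <- scores %>%
  filter(score > 70) %>%
  select(name, score)

# Both forms should give the same data frame
identical(nested, piped)
```

The piped form lists the steps in the order they run, which tends to matter more as pipelines grow longer.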
Mutate
# add three new columns, each filled with a constant value
diamonds %>%
  mutate(JustOne = 1,
         Values = "something",
         Simple = TRUE)
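diamonds %>%
  mutate(price_discounted = price * 0.9)  # new column, price is unchanged
diamonds %>%
  mutate(price = price * 0.9,             # overwrites the existing price column
         mean_price = mean(price))        # uses the already discounted price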
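Because mutate creates columns in the order they are written, a later expression sees the columns changed earlier in the same call. A minimal sketch, assuming a made-up three-row data frame named prices:

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration
prices <- data.frame(price = c(100, 200, 300))

result <- prices %>%
  mutate(price = price * 0.9,        # overwrite price with the discounted value
         mean_price = mean(price))   # mean of the discounted prices, not the originals

result
```

Here mean_price comes out as 180 (the mean of 90, 180, 270) rather than 200, the mean of the original prices.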
ifelse
Example:
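As an illustration (not the original slide content), ifelse(test, yes, no) checks a condition for every element of a vector and returns the yes value where it holds and the no value elsewhere. The vector x and the labels below are made up.

```r
# Hypothetical input vector
x <- c(3, 8, 5, 10)

# For each element: "high" if it is greater than 5, otherwise "low"
labels <- ifelse(x > 5, "high", "low")

labels
```

The result has the same length as x, which is what makes ifelse convenient for building a new column.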
ifelse with mutate
practice %>%
  mutate(Health = ifelse(Subject == 1,
                         "sick",
                         "healthy"))
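The practice data frame is not defined in the slides. To run the ifelse with mutate example, one possible stand-in is sketched below; its contents are an assumption made purely for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the practice data frame
practice <- data.frame(Subject = c(1, 2, 1, 3))

labelled <- practice %>%
  mutate(Health = ifelse(Subject == 1,
                         "sick",       # rows where Subject is 1
                         "healthy"))   # all other rows

labelled
```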
Filter
diamonds %>%
  filter(cut == "Fair")
Returns only the rows where cut equals "Fair".
Equivalent to:
diamonds[diamonds$cut == "Fair", ]
# conditions separated by commas are combined with AND
diamonds %>%
  filter(cut == "Fair" | cut == "Good",
         price <= 600)
Same as:
diamonds %>%
  filter(cut %in% c("Fair", "Good"),
         price <= 600)
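The equivalence of the two filters can be verified on a small table. The sketch below assumes a made-up data frame named gems that only mimics two diamonds columns.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration
gems <- data.frame(
  cut   = c("Fair", "Good", "Ideal", "Fair", "Good"),
  price = c(500, 650, 400, 700, 580)
)

# Version with the OR operator
with_or <- gems %>%
  filter(cut == "Fair" | cut == "Good",
         price <= 600)

# Version with %in%
with_in <- gems %>%
  filter(cut %in% c("Fair", "Good"),
         price <= 600)

# Both versions should keep the same rows
identical(with_or, with_in)
```

The %in% form usually scales better once there are more than two allowed values.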
Select
• Select only the columns that you want to see; all other
columns are dropped.
• Columns can be given by name or by position.
• The order in which you list the column names/positions
is the order in which the columns will be displayed.
diamonds %>%
  select(cut, color)
Same as (cut and color are columns 2 and 3 of diamonds):
diamonds %>%
  select(2:3)
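A short sketch of selecting by name and by position, and of how the listed order controls the output order. The gems data frame is a made-up stand-in, so its column positions are assumptions of this example only.

```r
library(dplyr)

# Hypothetical toy data: columns are carat (1), cut (2), color (3), price (4)
gems <- data.frame(
  carat = c(0.3, 0.5),
  cut   = c("Fair", "Good"),
  color = c("E", "J"),
  price = c(400, 900)
)

# By name, with color listed before cut
by_name <- gems %>%
  select(color, cut)

# By position, in the same order
by_position <- gems %>%
  select(3, 2)

names(by_name)
identical(by_name, by_position)
```

Selecting by name is generally more robust, since positions change whenever columns are added or reordered.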
Select with negative sign
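As an illustration (not the original slide content), a minus sign in front of a column tells select to drop that column and keep everything else. The gems data frame below is a made-up stand-in.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration
gems <- data.frame(
  carat = c(0.3, 0.5),
  cut   = c("Fair", "Good"),
  color = c("E", "J"),
  price = c(400, 900)
)

# Drop a single column
without_price <- gems %>%
  select(-price)

# Drop several columns at once
without_cut_color <- gems %>%
  select(-c(cut, color))

names(without_price)
names(without_cut_color)
```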
group by and summarize
data %>%
  group_by(Country) %>%
  summarize(m = mean(Score),
            s = sd(Score),
            n = n()) # calculate the count
Equivalent to a groupby followed by an aggregation in Python (for example, with pandas).
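To try the grouped summary in R, the sketch below builds a small made-up data frame with the Country and Score columns used above; the values are invented for illustration.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration
data <- data.frame(
  Country = c("Armenia", "Armenia", "Georgia", "Georgia", "Georgia"),
  Score   = c(80, 90, 70, 75, 95)
)

summary_tbl <- data %>%
  group_by(Country) %>%          # one group per country
  summarize(m = mean(Score),     # mean score within the group
            s = sd(Score),       # standard deviation within the group
            n = n())             # number of rows in the group

summary_tbl
```

The result has one row per country, so five input rows collapse to two.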
Arrange
diamonds %>%
  arrange(cut)          # sort rows by cut, ascending
diamonds %>%
  arrange(desc(price))  # sort rows by price, descending
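A small sketch of both sort directions, assuming a made-up data frame named gems. Note that cut is a plain character column here, so it sorts alphabetically; in diamonds it is an ordered factor and sorts by its level order instead.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration
gems <- data.frame(
  cut   = c("Ideal", "Fair", "Good"),
  price = c(400, 650, 500)
)

# Ascending order by cut
by_cut <- gems %>%
  arrange(cut)

# Descending order by price
by_price_desc <- gems %>%
  arrange(desc(price))

by_cut$cut
by_price_desc$price
```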
Data Manipulation - examples set 1
Data Manipulation - examples set 2
Data Manipulation - examples set 3
Summary
Reading
https://bookdown.org/yih_huynh/Guide-to-R-Book/basic-data-management.html
Questions?