Manipulating Data in R

This document provides a comprehensive guide on manipulating data in R, focusing on reshaping data between long and wide formats, merging datasets, and performing operations by grouping variables. It includes practical examples using the dplyr and tidyr packages, as well as base R functions. The document also covers data visualization techniques using ggplot2.

Manipulating Data in R

John Muschelli

January 7, 2016
Overview

In this module, we will show you how to:

1. Reshape data from long (tall) to wide (fat)
2. Reshape data from wide (fat) to long (tall)
3. Merge data
4. Perform operations by a grouping variable
Setup

We will show you how to do each operation in base R, then show you how to use the dplyr or tidyr package to do the same operation (if applicable).

See the “Data Wrangling Cheat Sheet using dplyr and tidyr”:

https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf
Load the packages/libraries

library(dplyr)

Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

filter, lag

The following objects are masked from 'package:base':

intersect, setdiff, setequal, union

library(tidyr)
Data used: Charm City Circulator
http://www.aejaffe.com/winterR_2016/data/Charm_City_Circulator_Ridership.csv
Let’s read in the Charm City Circulator data:

ex_data = read.csv("http://www.aejaffe.com/winterR_2016/dat
head(ex_data, 2)

day date orangeBoardings orangeAlightings orang

1 Monday 01/11/2010 877 1027
2 Tuesday 01/12/2010 777 815
purpleBoardings purpleAlightings purpleAverage greenBoard
1 NA NA NA
2 NA NA NA
greenAlightings greenAverage bannerBoardings bannerAlight
1 NA NA NA
2 NA NA NA
bannerAverage daily
1 NA 952
Creating a Date class from a character date
The lubridate package is great for dates:

library(lubridate) # great for dates!

ex_data = mutate(ex_data, date = mdy(date))
nrow(ex_data[ is.na(ex_data$date), ])

[1] 0

head(ex_data$date)

[1] "2010-01-11 UTC" "2010-01-12 UTC" "2010-01-13 UTC" "201

[5] "2010-01-15 UTC" "2010-01-16 UTC"

class(ex_data$date)

[1] "POSIXct" "POSIXt"
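The class shown above reflects the lubridate version in use when these slides were written; newer releases may return a Date rather than a POSIXct object from mdy, so it is worth checking class() on your own installation. As a minimal sketch of the same conversion in base R, assuming two made-up date strings rather than the ridership file:

```r
# Toy character dates in month/day/year form (not read from the ridership file)
chr_dates <- c("01/11/2010", "01/12/2010")

# as.Date() with an explicit format parses the same layout that mdy() handles
parsed <- as.Date(chr_dates, format = "%m/%d/%Y")

class(parsed)        # class of the parsed vector
sum(is.na(parsed))   # number of strings that failed to parse
```

Counting the NA values after parsing is the same check the slide performs with nrow(ex_data[is.na(ex_data$date), ]).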

Making column names a little more separated

We will use str_replace from stringr to put periods in the column names.

library(stringr)
cn = colnames(ex_data)
cn = cn %>%
str_replace("Board", ".Board") %>%
str_replace("Alight", ".Alight") %>%
str_replace("Average", ".Average")
colnames(ex_data) = cn
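The three chained str_replace calls can also be written as a single regular expression in base R. The following is only a sketch on a hypothetical vector of column names in the same style as the ridership data:

```r
# Toy column names in the same style as the ridership data
cn <- c("day", "date", "orangeBoardings", "orangeAlightings",
        "orangeAverage", "daily")

# Insert a period before whichever of the three suffixes ends the name;
# names without one of the suffixes are returned unchanged
cn_dot <- sub("(Boardings|Alightings|Average)$", ".\\1", cn)
cn_dot
```

Anchoring the pattern with $ means only a trailing suffix is changed, which is slightly stricter than the str_replace version.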
Removing the daily ridership

We want to look at the ridership of each line, so we will remove the daily column:

ex_data$daily = NULL
Reshaping data from wide (fat) to long (tall)

See http://www.cookbook-r.com/Manipulating_data/Converting_data_between_wide_and_long_format/
Reshaping data from wide (fat) to long (tall): base R

The reshape command exists. It is a confusing function. Don’t use it.
Reshaping data from wide (fat) to long (tall): tidyr
In tidyr, the gather function gathers columns into rows.
We want the column names to go into a “var” variable in the output
dataset and the values into a “number” variable. We then describe
which columns we want to “gather”:

long = gather(ex_data, "var", "number",

starts_with("orange"),
starts_with("purple"), starts_with("green"),
starts_with("banner"))
head(long)

day date var number

1 Monday 2010-01-11 orange.Boardings 877
2 Tuesday 2010-01-12 orange.Boardings 777
3 Wednesday 2010-01-13 orange.Boardings 1203
4 Thursday 2010-01-14 orange.Boardings 1194
5 Friday 2010-01-15 orange.Boardings 1645
6 Saturday 2010-01-16 orange.Boardings 1457
Reshaping data from wide (fat) to long (tall): tidyr
Now each var combines a line name with Boardings, Alightings, or Average. We want to
separate these two pieces so that we can work with the data by line.

long = separate_(long, "var", into = c("line", "type"), sep

head(long)

day date line type number

1 Monday 2010-01-11 orange Boardings 877
2 Tuesday 2010-01-12 orange Boardings 777
3 Wednesday 2010-01-13 orange Boardings 1203
4 Thursday 2010-01-14 orange Boardings 1194
5 Friday 2010-01-15 orange Boardings 1645
6 Saturday 2010-01-16 orange Boardings 1457

table(long$line)

banner green orange purple

Reshaping data from long (tall) to wide (fat): tidyr
In tidyr, the spread function spreads rows into columns. Now we
have a long data set, but we want to separate the Average,
Alightings and Boardings into different columns:

# have to remove missing days

wide = filter(long, !is.na(date))
wide = spread(wide, type, number)
head(wide)

day date line Alightings Average Boardings

1 Friday 2010-01-15 banner NA NA NA
2 Friday 2010-01-15 green NA NA NA
3 Friday 2010-01-15 orange 1643 1644 1645
4 Friday 2010-01-15 purple NA NA NA
5 Friday 2010-01-22 banner NA NA NA
6 Friday 2010-01-22 green NA NA NA
Reshaping data from long (tall) to wide (fat): tidyr
We can use rowSums to count the non-missing values in each row, and keep
a row, which is a combination of date and line, only if it has any
non-missing data.

# wide = wide %>%

# select(Alightings, Average, Boardings) %>%
# mutate(good = rowSums(is.na(.)) > 0)
namat = !is.na(select(wide, Alightings, Average, Boardings))
head(namat)

Alightings Average Boardings

1 FALSE FALSE FALSE
2 FALSE FALSE FALSE
3 TRUE TRUE TRUE
4 FALSE FALSE FALSE
5 FALSE FALSE FALSE
6 FALSE FALSE FALSE
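To make the rowSums step concrete, here is a minimal base R sketch on a hypothetical three-row data frame. The object names toy, present, good, and kept are illustrative only:

```r
# Toy wide data: row 2 is entirely missing, row 3 is partly missing
toy <- data.frame(
  line = c("orange", "purple", "green"),
  Alightings = c(1643, NA, 50),
  Average = c(1644, NA, NA),
  Boardings = c(1645, NA, 60)
)

# Logical matrix: TRUE where a value is present
present <- !is.na(toy[, c("Alightings", "Average", "Boardings")])

# rowSums counts the TRUE values per row; keep rows with at least one
good <- rowSums(present) > 0
kept <- toy[good, ]
kept
```

A partly missing row is kept, because the condition asks for any non-missing value rather than all of them.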
Reshaping data from long (tall) to wide (fat): tidyr

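Now we can flag the good rows, keep only those, and delete the good column.

# assumed step, not shown on the slide: flag rows with any non-missing value
wide$good = rowSums(namat) > 0
wide = filter(wide, good) %>% select(-good)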

head(wide)

day date line Alightings Average Boardings

1 Friday 2010-01-15 orange 1643 1644.0 1645
2 Friday 2010-01-22 orange 1388 1394.5 1401
3 Friday 2010-01-29 orange 1322 1332.0 1342
4 Friday 2010-02-05 orange 1204 1217.5 1231
5 Friday 2010-02-12 orange 678 671.0 664
6 Friday 2010-02-19 orange 1647 1642.0 1637
Data Merging/Append in Base R

- Merging - joining data sets together - usually on key variables, usually “id”
- merge() is the most common way to do this with data sets
- rbind/cbind - row/column bind, respectively
- rbind is the equivalent of “appending” in Stata or “setting” in SAS
- cbind allows you to add columns in addition to the previous ways
- t() is a function that will transpose the data
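The binding and transposing functions listed above can be sketched on two small hypothetical data frames (the names a, b, and site are made up for illustration):

```r
# Two toy data frames with the same columns
a <- data.frame(id = 1:2, Age = c(55, 56))
b <- data.frame(id = 3:4, Age = c(57, 58))

stacked <- rbind(a, b)                                    # append rows
widened <- cbind(stacked, site = c("A", "A", "B", "B"))   # add a column
flipped <- t(stacked)                                     # transpose into a matrix

dim(stacked)
dim(widened)
dim(flipped)
```

Note that rbind expects matching column names, and t returns a matrix rather than a data frame.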
Merging

base <- data.frame(id = 1:10, Age= seq(55,60, length=10))

base[1:2,]

id Age
1 1 55.00000
2 2 55.55556

visits <- data.frame(id = rep(1:8, 3), visit= rep(1:3, 8),

Outcome = seq(10,50, length=24))
visits[1:2,]

id visit Outcome
1 1 1 10.00000
2 2 2 11.73913
Merging

merged.data <- merge(base, visits, by="id")

merged.data[1:5,]

id Age visit Outcome

1 1 55.00000 1 10.00000
2 1 55.00000 3 23.91304
3 1 55.00000 2 37.82609
4 2 55.55556 2 11.73913
5 2 55.55556 1 25.65217

dim(merged.data)

[1] 24 4
Merging

all.data <- merge(base, visits, by="id", all=TRUE)

tail(all.data)

id Age visit Outcome

21 7 58.33333 2 48.26087
22 8 58.88889 2 22.17391
23 8 58.88889 1 36.08696
24 8 58.88889 3 50.00000
25 9 59.44444 NA NA
26 10 60.00000 NA NA

dim(all.data)

[1] 26 4
Joining in dplyr

- ?join - see different types of joining for dplyr
- Let’s look at https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf
Left Join
lj = left_join(base, visits)

Joining by: "id"

dim(lj)

[1] 26 4

tail(lj)

id Age visit Outcome

21 7 58.33333 2 48.26087
22 8 58.88889 2 22.17391
23 8 58.88889 1 36.08696
24 8 58.88889 3 50.00000
25 9 59.44444 NA NA
26 10 60.00000 NA NA
Right Join
rj = right_join(base, visits)

Joining by: "id"

dim(rj)

[1] 24 4

tail(rj)

id Age visit Outcome

19 3 56.11111 1 41.30435
20 4 56.66667 2 43.04348
21 5 57.22222 3 44.78261
22 6 57.77778 1 46.52174
23 7 58.33333 2 48.26087
24 8 58.88889 3 50.00000
Full Join
fj = full_join(base, visits)

Joining by: "id"

dim(fj)

[1] 26 4

tail(fj)

id Age visit Outcome

21 7 58.33333 2 48.26087
22 8 58.88889 2 22.17391
23 8 58.88889 1 36.08696
24 8 58.88889 3 50.00000
25 9 59.44444 NA NA
26 10 60.00000 NA NA
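Each of the dplyr joins above has a counterpart in base R's merge, selected with the all, all.x, and all.y arguments. The sketch below uses two small hypothetical data frames rather than base and visits, and names the key explicitly with by instead of relying on the automatic choice:

```r
# Toy data: ids 3 and 4 have no visits, id 5 has no baseline record
base_toy <- data.frame(id = 1:4, Age = c(55, 56, 57, 58))
visits_toy <- data.frame(id = c(1, 1, 2, 5), visit = c(1, 2, 1, 1))

inner <- merge(base_toy, visits_toy, by = "id")                # ids in both
left  <- merge(base_toy, visits_toy, by = "id", all.x = TRUE)  # like left_join
right <- merge(base_toy, visits_toy, by = "id", all.y = TRUE)  # like right_join
full  <- merge(base_toy, visits_toy, by = "id", all = TRUE)    # like full_join

c(inner = nrow(inner), left = nrow(left),
  right = nrow(right), full = nrow(full))
```

The row order returned by merge can differ from that of the dplyr joins, so compare contents rather than positions when checking one against the other.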
Perform Operations By Groups: base R

The tapply command will take in a vector (X), perform a function (FUN) over an index (INDEX):

args(tapply)

function (X, INDEX, FUN = NULL, ..., simplify = TRUE)

NULL
Perform Operations By Groups: base R

Let’s get the mean Average ridership by line:

tapply(wide$Average, wide$line, mean, na.rm = TRUE)

banner green orange purple

827.2685 1957.7814 3033.1611 4016.9345
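To see what tapply returns, here is a minimal sketch on a hypothetical data frame, alongside aggregate, another base R function that produces a data frame instead of a named vector:

```r
# Toy data with one missing Average value
toy <- data.frame(
  line = c("orange", "orange", "purple", "purple", "purple"),
  Average = c(100, 200, 300, NA, 500)
)

# tapply returns one named element per group; na.rm is passed on to mean
by_line <- tapply(toy$Average, toy$line, mean, na.rm = TRUE)
by_line

# The formula interface of aggregate drops rows with NA by default
# (na.action = na.omit) and returns a data frame
by_line_df <- aggregate(Average ~ line, data = toy, FUN = mean)
by_line_df
```

Both calls give the same group means here; which one to use is mostly a matter of whether a vector or a data frame is more convenient downstream.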
Perform Operations By Groups: dplyr
Let’s get the mean Average ridership by line. We will use group_by
to group the data by line, then use summarize (or summarise) to
get the mean Average ridership:

gb = group_by(wide, line)
summarize(gb, mean_avg = mean(Average))

Source: local data frame [4 x 2]

line mean_avg
(chr) (dbl)
1 banner 827.2685
2 green 1957.7814
3 orange 3033.1611
4 purple 4016.9345
Perform Operations By Groups: dplyr with piping
Using piping, this is:

wide %>%
group_by(line) %>%
summarise(mean_avg = mean(Average))

Source: local data frame [4 x 2]

line mean_avg
(chr) (dbl)
1 banner 827.2685
2 green 1957.7814
3 orange 3033.1611
4 purple 4016.9345
Perform Operations By Multiple Groups: dplyr
This can easily be extended using group_by with multiple groups.
Let’s define the year and month of each date:

wide = wide %>% mutate(year = year(date),

month = month(date))
wide %>%
group_by(line, year) %>%
summarise(mean_avg = mean(Average))

Source: local data frame [13 x 3]

Groups: line [?]

line year mean_avg

(chr) (dbl) (dbl)
1 banner 2012 882.0929
2 banner 2013 635.3833
3 green 2011 1455.1667
4 green 2012 2028.7740
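One detail worth noting: mean returns NA for a group that contains any missing value, which is why the tapply call earlier passed na.rm = TRUE. The sketch below shows the same safeguard inside summarise, assuming dplyr is installed and using a hypothetical data frame. The year is extracted with format from base R as a stand-in for lubridate's year:

```r
library(dplyr)

# Toy data: one missing Average in the purple line
toy <- data.frame(
  line = c("orange", "orange", "orange", "purple", "purple"),
  date = as.Date(c("2010-01-11", "2010-06-01", "2011-01-10",
                   "2011-02-01", "2011-03-01")),
  Average = c(100, 200, 300, NA, 500)
)

by_line_year <- toy %>%
  mutate(year = as.numeric(format(date, "%Y"))) %>%  # stand-in for year(date)
  group_by(line, year) %>%
  summarise(mean_avg = mean(Average, na.rm = TRUE))  # ignore missing values
by_line_year
```

Without na.rm = TRUE, the purple group for 2011 would come back as NA instead of the mean of its observed values.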
Perform Operations By Multiple Groups: dplyr
We can then easily plot each day over time:

library(ggplot2)
ggplot(aes(x = date, y = Average,
colour = line), data = wide) + geom_line()

[Figure: line plot of Average (y-axis) against date (x-axis), coloured by line: banner, green, orange, purple]
Perform Operations By Multiple Groups: dplyr
Let’s compute the monthly mean for each line in a data set named mon, and
create a date for the middle of each month (the 15th, for example) named mid_month.

mon = wide %>%

dplyr::group_by(line, month, year) %>%
dplyr::summarise(mean_avg = mean(Average))
mon = mutate(mon,
mid_month = dmy(paste0("15-", month, "-", year)))
head(mon)

Source: local data frame [6 x 5]

Groups: line, month [6]

line month year mean_avg mid_month

(chr) (dbl) (dbl) (dbl) (time)
1 banner 1 2013 610.3226 2013-01-15
2 banner 2 2013 656.4643 2013-02-15
3 banner 3 2013 822.0000 2013-03-15
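The mid-month date can also be built without lubridate. This is a minimal base R sketch on a hypothetical data frame with numeric month and year columns:

```r
# Toy month/year combinations
mon_toy <- data.frame(month = c(1, 2, 12), year = c(2013, 2013, 2012))

# sprintf zero-pads the month so each string is in year-month-day form,
# which as.Date() parses by default
mon_toy$mid_month <- as.Date(sprintf("%d-%02d-15", mon_toy$year, mon_toy$month))
mon_toy
```

Having a real date column, rather than separate month and year numbers, is what lets a plot place the monthly means on a continuous time axis.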
Perform Operations By Multiple Groups: dplyr
We can then easily plot the mean of each month to see a smoother
output:

ggplot(aes(x = mid_month,
y = mean_avg,
colour = line), data = mon) + geom_line()

[Figure: line plot of mean_avg (y-axis) against mid_month (x-axis), coloured by line: banner, green, orange, purple]
Bonus! Points with a smoother!
ggplot(aes(x = date, y = Average, colour = line),
data = wide) + geom_smooth(se = FALSE) +
geom_point(size = .5)

[Figure: scatter plot with smoothed curves of Average (y-axis) against date (x-axis), coloured by line: banner, green, orange, purple]

