R For Data Science - Tidyverse For Beginners (Ggplot2, Dplyr, Tidyr, Readr, Purrr, Tibble, Stringr, Forcats) PDF

The document provides a cheat sheet on using the tidyverse collection of R packages, which includes dplyr for data manipulation, ggplot2 for data visualization, and other packages like tidyr and purrr. It gives examples of using functions from dplyr like filter(), arrange(), and mutate() to select, sort, and modify data. Additionally, it demonstrates how to create scatter plots and line graphs with ggplot2 by mapping variables to aesthetics and using geoms like geom_point() and geom_line().

Uploaded by

Jose AG


R For Data Science Cheat Sheet
Tidyverse for Beginners
Learn More R for Data Science Interactively at www.datacamp.com

Tidyverse

The tidyverse is a powerful collection of R packages for transforming and visualizing data. All packages of the tidyverse share an underlying philosophy and common APIs.

The core packages are:

• ggplot2, which implements the grammar of graphics. You can use it to visualize your data.
• dplyr, a grammar of data manipulation. You can use it to solve the most common data manipulation challenges.
• tidyr, which helps you create tidy data: data where each variable is a column, each observation is a row, and each value is a cell.
• readr, a fast and friendly way to read rectangular data.
• purrr, which enhances R’s functional programming (FP) toolkit by providing a complete and consistent set of tools for working with functions and vectors.
• tibble, a modern reimagining of the data frame.
• stringr, which provides a cohesive set of functions designed to make working with strings as easy as possible.
• forcats, which provides a suite of useful tools that solve common problems with factors.

You can install the complete tidyverse with:

> install.packages("tidyverse")

Then, load the core tidyverse and make it available in your current R session by running:

> library(tidyverse)

Note: there are many other tidyverse packages with more specialised usage. They are not loaded automatically with library(tidyverse), so you’ll need to load each one with its own call to library().

Useful Functions

> tidyverse_conflicts()    Conflicts between tidyverse and other packages
> tidyverse_deps()         List all tidyverse dependencies
> tidyverse_logo()         Get tidyverse logo, using ASCII or unicode characters
> tidyverse_packages()     List all tidyverse packages
> tidyverse_update()       Update tidyverse packages

Loading in the data

> library(datasets)        Load the datasets package
> library(gapminder)       Load the gapminder package
> attach(iris)             Attach iris data to the R search path

dplyr

Filter

filter() allows you to select a subset of rows in a data frame.

Select iris data of species "virginica":

> iris %>%
    filter(Species=="virginica")

Select iris data of species "virginica" and sepal length greater than 6:

> iris %>%
    filter(Species=="virginica",
           Sepal.Length > 6)

Arrange

arrange() sorts the observations in a dataset in ascending or descending order based on one of its variables.

Sort in ascending order of sepal length:

> iris %>%
    arrange(Sepal.Length)

Sort in descending order of sepal length:

> iris %>%
    arrange(desc(Sepal.Length))

Combine multiple dplyr verbs in a row with the pipe operator %>%. Filter for species "virginica", then arrange in descending order of sepal length:

> iris %>%
    filter(Species=="virginica") %>%
    arrange(desc(Sepal.Length))

Mutate

mutate() allows you to update or create new columns of a data frame.

Change Sepal.Length to be in millimeters:

> iris %>%
    mutate(Sepal.Length=Sepal.Length*10)

Create a new column called SLMm:

> iris %>%
    mutate(SLMm=Sepal.Length*10)

Combine the verbs filter(), arrange(), and mutate():

> iris %>%
    filter(Species=="virginica") %>%
    mutate(SLMm=Sepal.Length*10) %>%
    arrange(desc(SLMm))

Summarize

summarize() allows you to turn many observations into a single data point.

Summarize to find the median sepal length:

> iris %>%
    summarize(medianSL=median(Sepal.Length))

Filter for virginica, then summarize the median sepal length:

> iris %>%
    filter(Species=="virginica") %>%
    summarize(medianSL=median(Sepal.Length))

You can also summarize multiple variables at once:

> iris %>%
    filter(Species=="virginica") %>%
    summarize(medianSL=median(Sepal.Length),
              maxSL=max(Sepal.Length))

group_by() allows you to summarize within groups instead of summarizing the entire dataset.

Find median and max sepal length of each species:

> iris %>%
    group_by(Species) %>%
    summarize(medianSL=median(Sepal.Length),
              maxSL=max(Sepal.Length))

Find median and max petal length of each species with sepal length > 6:

> iris %>%
    filter(Sepal.Length>6) %>%
    group_by(Species) %>%
    summarize(medianPL=median(Petal.Length),
              maxPL=max(Petal.Length))

ggplot2

Scatter plot

Scatter plots allow you to compare two variables within your data. To do this with ggplot2, you use geom_point().

Compare petal width and length:

> iris_small <- iris %>%
    filter(Sepal.Length > 5)
> ggplot(iris_small, aes(x=Petal.Length,
                         y=Petal.Width)) +
    geom_point()

Additional Aesthetics

• Color

> ggplot(iris_small, aes(x=Petal.Length,
                         y=Petal.Width,
                         color=Species)) +
    geom_point()

• Size

> ggplot(iris_small, aes(x=Petal.Length,
                         y=Petal.Width,
                         color=Species,
                         size=Sepal.Length)) +
    geom_point()

Faceting

> ggplot(iris_small, aes(x=Petal.Length,
                         y=Petal.Width)) +
    geom_point() +
    facet_wrap(~Species)

Line Plots

> by_year <- gapminder %>%
    group_by(year) %>%
    summarize(medianGdpPerCap=median(gdpPercap))
> ggplot(by_year, aes(x=year,
                      y=medianGdpPerCap)) +
    geom_line() +
    expand_limits(y=0)

Bar Plots

> by_species <- iris %>%
    filter(Sepal.Length>6) %>%
    group_by(Species) %>%
    summarize(medianPL=median(Petal.Length))
> ggplot(by_species, aes(x=Species,
                         y=medianPL)) +
    geom_col()

Histograms

> ggplot(iris_small, aes(x=Petal.Length)) +
    geom_histogram()

Box Plots

> ggplot(iris_small, aes(x=Species,
                         y=Sepal.Width)) +
    geom_boxplot()
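The dplyr verbs covered in this sheet (filter(), group_by(), summarize(), arrange()) can be chained into a single pipeline. The following is a minimal sketch rather than part of the original cheat sheet. It assumes the dplyr package is installed, and the names n_wrong, n_right, and the n column are illustrative. It also shows that string comparison inside filter() is case-sensitive, so a label such as "Virginica" matches no rows in iris.

```r
# Assumes dplyr is installed; iris ships with base R in the datasets package.
library(dplyr)

# Species labels in iris are lowercase, so the capitalized spelling matches nothing.
n_wrong <- iris %>% filter(Species == "Virginica") %>% nrow()
n_right <- iris %>% filter(Species == "virginica") %>% nrow()

# Filter rows, group by species, summarize each group, then sort the summary.
by_species <- iris %>%
  filter(Sepal.Length > 6) %>%
  group_by(Species) %>%
  summarize(medianPL = median(Petal.Length),
            n = n()) %>%               # n() counts the rows in each group
  arrange(desc(medianPL))

print(n_wrong)   # 0
print(n_right)   # 50
print(by_species)
```

Note that the summary has only two rows: no setosa flower has a sepal length above 6, so that species drops out after the filter() step. The same effect applies to the bar plot built from by_species.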
