0% found this document useful (0 votes)
233 views

R Programming Cheat Sheet

This document provides an overview of the R programming language and covers topics such as data types, basic operations, data manipulation, visualization, control structures, importing/exporting data, functions, and graphics. It describes commonly used functions and packages for working with data, performing calculations, reshaping data, plotting, and more in R. The document is intended as an ultimate guide to using R for data analysis and visualization.

Uploaded by

Hassan Bakary
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
233 views

R Programming Cheat Sheet

This document provides an overview of the R programming language and covers topics such as data types, basic operations, data manipulation, visualization, control structures, importing/exporting data, functions, and graphics. It describes commonly used functions and packages for working with data, performing calculations, reshaping data, plotting, and more in R. The document is intended as an ultimate guide to using R for data analysis and visualization.

Uploaded by

Hassan Bakary
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 15

R Programming Ultimate Guide

Data Types
┣ Numeric : Real or decimal numbers

┣ Integer : Whole numbers

┣ Character : Text strings

┣ Factor : Categorical variables with levels

┣ Date : Dates (YYYY-MM-DD format)

┣ POSIXct : Dates and times (seconds since 1970-01-01 00:00:00 UTC)

┣ Logical : True or false values

┣ List : Collection of objects

┗ Data frame : Tabular data with columns of different types

Basic Operations
┣ x + y : Addition

┣ x - y : Subtraction

┣ x * y : Multiplication

┣ x / y : Division

┣ x %/% y : Integer division

┣ x ^ y : Exponentiation

┣ sqrt(x) : Square root

┣ abs(x) : Absolute value

┣ ceiling(x) : Smallest integer greater than or equal to x

┣ floor(x) : Largest integer less than or equal to x

┣ round(x, digits) : Round x to the specified number of digits

┣ trunc(x) : Truncate x to an integer

┣ min(x, y) : Minimum value of x and y

┗ max(x, y) : Maximum value of x and y

By: Waleed Mousa


R Programming Ultimate Guide
Data Manipulation and Visualization
┣ install.packages("package_name") : Install a package

┣ library(package_name) : Load a package

┣ read.csv("filename.csv") : Read a CSV file

┣ read.table("filename.txt") : Read a text file

┣ write.csv(data, "filename.csv") : Write data to a CSV file

┣ write.table(data, "filename.txt") : Write data to a text file

┣ dim(data) : Get the dimensions of a data frame

┣ head(data) : View the first few rows of a data frame

┣ tail(data) : View the last few rows of a data frame

┣ summary(data) : View summary statistics of a data frame

┣ subset(data, subset) : Subset a data frame based on a condition

┣ merge(data1, data2) : Merge two data frames

┣ aggregate(data, by) : Aggregate data by a specified variable

┣ tapply(data, by) : Apply a function to subsets of data based on a specified


variable

┣ transform(data, new_var = f(var)) : Add a new variable to a data frame


based

┣ sort(data, decreasing = FALSE) : Sort a data frame by one or more


variables

┣ order(data, decreasing = FALSE) : Order a data frame by one or more


variables

┣ unique(data) : Remove duplicate rows from a data frame

┣ duplicated(data) : Identify duplicate rows in a data frame

┣ which(data > value) : Find the indices of values that meet a condition

┣ is.na(data) : Identify missing values in a data frame

┣ na.omit(data) : Remove missing values from a data frame

┣ complete.cases(data) : Identify complete cases in a data frame


By: Waleed Mousa
R Programming Ultimate Guide
┣ reshape(data, idvar, timevar, direction) : Reshape a data frame from long
to wide or vice versa

┣ melt(data, id.vars, measure.vars) : Reshape a data frame from wide to long

┗ dcast(data, formula, fun.aggregate) : Reshape a data frame using an


aggregation function

┣ library(dplyr) : Load the dplyr package for data manipulation

┣ library(tidyr) : Load the tidyr package for data manipulation

┣ library(ggplot2) : Load the ggplot2 package for graphics

┣ library(lubridate) : Load the lubridate package for working with dates and
times

┣ library(stringr) : Load the stringr package for string manipulation

┣ select(data, variables) : Select variables from a data frame

┣ filter(data, condition) : Filter rows from a data frame based on a


condition

┣ group_by(data, variable) : Group a data frame by a variable

┣ summarise(data, variable = function) : Apply a function to a variable in a


grouped data frame

┣ mutate(data, variable = function) : Create a new variable in a data frame

┣ arrange(data, variables) : Arrange rows in a data frame by variables

┣ gather(data, key, value, columns) : Convert multiple columns into key-


value pairs

┣ spread(data, key, value) : Convert key-value pairs into multiple columns

┣ separate(data, column, into = c("new1", "new2", ...)) : Separate a column


into multiple columns

┣ unite(data, new_column, columns, sep = "") : Combine multiple columns into


a single column

┣ merge(data1, data2, by = "variable") : Merge two data frames by a variable

┣ ggplot(data, aes(x, y)) + geom_*(...) : Create a plot using ggplot2

┣ library(ggmap) : Load the ggmap package for working with maps

By: Waleed Mousa


R Programming Ultimate Guide
┣ get_map(location) : Get a map for a location

┣ ggmap(map) + geom_*(...) : Create a map plot using ggmap

┣ library(ggvis) : Load the ggvis package for interactive graphics

┣ ggvis(data, ~x, ~y) %>% layer_*(...) %>% ... : Create an interactive plot
using ggvis

┣ library(plotly) : Load the plotly package for interactive graphics

┣ plot_ly(data, x = ~x, y = ~y, ...) %>% add_*() %>% ... : Create an


interactive plot using plotly

┗ library(leaflet) : Load the leaflet package for interactive maps

Control Structures
┣ if (condition) {expression} : Execute an expression if a condition is true

┣ ifelse(condition, true_expression, false_expression) : Execute one


expression if a condition is true and another if it is false

┣ for (variable in sequence) {expression} : Execute an expression for each


element in a sequence

┣ while (condition) {expression} : Execute an expression while a condition


is true

┣ repeat {expression} : Execute an expression indefinitely until a break


statement is encountered

┣ break : Exit a loop

┣ next : Skip an iteration in a loop

┣ return(value) : Return a value from a function

┣ source("filename.R") : Run R code from a file

┣ setwd("path") : Set the working directory

┗ getwd() : Get the current working directory

By: Waleed Mousa


R Programming Ultimate Guide
Data Import and Export
┣ library(readr) : Load the readr package for data import

┣ library(readxl) : Load the readxl package for Excel files

┣ library(haven) : Load the haven package for SPSS, SAS, and Stata files

┣ library(dplyr) : Load the dplyr package for data manipulation

┣ library(tidyr) : Load the tidyr package for data manipulation

┣ library(tidyverse) : Load the tidyverse package for data manipulation and


visualization

┣ write.csv(data, file) : Export data as a CSV file

┣ write_excel_csv2(data, file) : Export data as an Excel file

┣ write.table(data, file) : Export data as a text file

┣ read.csv(file) : Import data from a CSV file

┣ read_excel(file) : Import data from an Excel file

┣ read_spss(file) : Import data from an SPSS file

┣ read_sas(file) : Import data from a SAS file

┣ read_stata(file) : Import data from a Stata file

┣ separate(data, column, into = c("new1", "new2", ...)) : Separate a column


into multiple columns

┣ gather(data, key, value, columns) : Convert multiple columns into key-


value pairs

┗ spread(data, key, value) : Convert key-value pairs into multiple columns

By: Waleed Mousa


R Programming Ultimate Guide
Functions
┣ function(arguments) {expression} : Define a function

┣ return(value) : Return a value from a function

┣ formals(function) : Get the formal arguments of a function

┣ missing(argument) : Check if an argument was supplied to a function

┣ source("filename.R") : Run R code from a file

┣ setwd("path") : Set the working directory

┗ getwd() : Get the current working directory

┣ args(function) : View the arguments of a function

┣ exists("var") : Check if a variable exists in the current environment

┣ ls() : View the objects in the current environment

┣ rm("var") : Remove a variable from the current environment

┣ source("file.R") : Run code from a file

┗ sink("file.txt") : Redirect output to a file

Graphics
┣ library(ggplot2) : Load the ggplot2 package

┣ ggplot(data, aes(x = x_var, y = y_var)) + geom_point() : Create a scatter


plot

┣ ggplot(data, aes(x = x_var, y = y_var)) + geom_line() : Create a line plot

┣ ggplot(data, aes(x = x_var, y = y_var)) + geom_bar(stat = "identity") :


Create a bar plot

┣ ggplot(data, aes(x = x_var, fill = fill_var)) + geom_bar() : Create a


stacked bar plot

┣ ggplot(data, aes(x = x_var, fill = fill_var)) + geom_histogram() : Create a


histogram

By: Waleed Mousa


R Programming Ultimate Guide
┣ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) + geom_boxplot() :
Create a box plot

┣ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) + geom_errorbar() :


Create an error bar plot

┣ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) + geom_smooth() :


Create a smooth line plot

┣ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) +


facet_grid(facets) : Create a grid of plots

┣ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) + labs(x = "X axis


label", y = "Y axis label", title = "Plot title") : Add labels and a title to a
plot

┣ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) + theme_bw() : Set


the plot theme to black and white

┣ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) + theme_classic() :


Set the plot theme to a classic style

┣ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) +


scale_x_continuous(limits = c(min_value, max_value), breaks = seq(min_value,
max_value, step)) : Set the limits and tick marks for the x-axis

┣ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) +


scale_y_continuous(limits = c(min_value, max_value), breaks = seq(min_value,
max_value, step)) : Set the limits and tick marks for the y-axis

┣ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) +


scale_fill_manual(values = c("color1", "color2", ...)) : Set the color scale for
the fill variable

┣ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) +


scale_color_manual(values = c("color1", "color2", ...)) : Set the color scale
for the color variable

┗ ggplot(data, aes(x = x_var, y = y_var, fill = fill_var)) +


scale_size_continuous(range = c(min_size, max_size)) : Set the size scale for
the size variable

By: Waleed Mousa


R Programming Ultimate Guide
Statistics
┣ library(stats) : Load the stats package

┣ mean(data) : Calculate the mean of a vector or column in a data

┣ sd(data) : Calculate the standard deviation of a vector or column in a


data frame

┣ var(data) : Calculate the variance of a vector or column in a data frame

┣ cor(data) : Calculate the correlation matrix of a data frame

┣ cov(data) : Calculate the covariance matrix of a data frame

┣ lm(y ~ x, data) : Fit a linear regression model to the data

┣ summary(lm_model) : View a summary of the linear regression model

┣ predict(lm_model, new_data) : Make predictions using the linear regression


model

┣ glm(y ~ x1 + x2 + ..., data, family = "binomial") : Fit a generalized


linear model to the data

┣ summary(glm_model) : View a summary of the generalized linear model

┣ predict(glm_model, new_data, type = "response") : Make predictions using


the generalized linear model

┣ library(lme4) : Load the lme4 package for mixed effects models

┣ lmer(y ~ x1 + x2 + ... + (1|group), data) : Fit a linear mixed effects


model to the data

┣ summary(lmer_model) : View a summary of the linear mixed effects model

┗ predict(lmer_model, new_data) : Make predictions using the linear mixed


effects model

┣ library(forecast) : Load the forecast package for time series analysis

┣ auto.arima(data) : Fit an ARIMA model to the data

┣ ets(data) : Fit an ETS model to the data

By: Waleed Mousa


R Programming Ultimate Guide
┣ stl(data) : Decompose a time series into trend, seasonal, and remainder
components

┣ acf(data) : Calculate the autocorrelation function of a time series

┣ pacf(data) : Calculate the partial autocorrelation function of a time


series

┣ library(nnet) : Load the nnet package for neural networks

Machine Learning
┣ library(caret) : Load the caret package for machine learning

┣ library(randomForest) : Load the randomForest package for random forests

┣ library(gbm) : Load the gbm package for gradient boosting

┣ library(xgboost) : Load the xgboost package for extreme gradient boosting

┣ train(formula, data, method) : Train a machine learning model using cross-


validation

┣ predict(model, new_data) : Make predictions using a machine learning model

┣ confusionMatrix(predictions, actuals) : Calculate the confusion matrix for


a classification model

┣ varImp(model) : Calculate the variable importance for a machine learning


model

┣ partialPlot(model, data, variable) : Create a partial dependence plot for


a machine learning model

┣ library(mlr) : Load the mlr package for machine learning

┣ makeLearner("classif.ksvm") : Create a support vector machine classifier

┣ makeLearner("regr.ranger") : Create a random forest regression model

┣ train(learner, task) : Train a machine learning model using cross-


validation

By: Waleed Mousa


R Programming Ultimate Guide
┣ library(h2o) : Load the h2o package for machine learning

┣ h2o.init() : Initialize the h2o package

┣ h2o.importFile(path) : Import a file into h2o

┣ h2o.glm(y = "response", x = c("predictor1", "predictor2", ...),


training_frame = data) : Fit a generalized linear model in h2o

┣ h2o.randomForest(y = "response", x = c("predictor1", "predictor2", ...),


training_frame = data) : Fit a random forest model in h2o

┗ h2o.deepLearning(y = "response", x = c("predictor1", "predictor2", ...),


training_frame = data) : Fit a deep learning model in h2o

Web Scraping
┣ library(rvest) : Load the rvest package for web scraping

┣ html_text(html_nodes(page, css)) : Extract text from HTML nodes

┣ html_attr(html_nodes(page, css), "attribute") : Extract attributes from


HTML nodes

┣ html_table(html_nodes(page, css)) : Extract tables from HTML nodes

┣ html_form(page) : Extract forms from an HTML page

┣ html_submit(form, "button") : Click a button in an HTML form

┣ library(httr) : Load the httr package for HTTP requests

┣ GET(url) : Send a GET request to a URL

┣ POST(url, body) : Send a POST request to a URL with a request body

┗ library(RSelenium) : Load the RSelenium package for automated web scraping

┗ remDr$findElement(using = "css selector", "selector")$clickElement() :


Click an element on a web page using RSelenium

By: Waleed Mousa


R Programming Ultimate Guide
Text Analysis
┣ library(tm) : Load the tm package for text mining

┣ Corpus(VectorSource(text)) : Create a corpus from a character vector

┣ tm_map(corpus, function) : Apply a function to a corpus

┣ DocumentTermMatrix(corpus) : Create a document-term matrix from a corpus

┣ findAssocs(dtm, term) : Find terms associated with a given term in a


document-term matrix

┣ library(quanteda) : Load the quanteda package for text analysis

┣ corpus(text, docnames = c("doc1", "doc2", ...)) : Create a corpus from a


character vector with document names

┣ tokens(corpus) : Split a corpus into tokens

┣ dfm(tokens) : Create a document-feature matrix from tokens

┣ textstat_frequency(dfm) : Calculate term frequencies from a document-


feature matrix

┣ textstat_collocations(dfm) : Find collocations from a document-feature


matrix

┣ library(quanteda.textmodels) : Load the quanteda.textmodels package for


text classification

┣ textmodel_nb(dfm, y) : Train a naive Bayes model on a document-feature


matrix and target variable

┣ textmodel_svm(dfm, y) : Train a support vector machine model on a


document-feature matrix and target variable

┣ library(wordcloud) : Load the wordcloud package for creating word clouds

┣ wordcloud(words, freq, max.words = 100, colors = brewer.pal(8, "Dark2")) :


Create a word cloud from a vector of words and their frequencies

┣ library(ggplot2) : Load the ggplot2 package for creating visualizations

┗ ggplot(data, aes(x, y)) + geom_*(...) : Create a plot using ggplot2

By: Waleed Mousa


R Programming Ultimate Guide
Time Series Analysis
┣ library(lubridate) : Load the lubridate package for working with dates and
times

┣ as_date(time) : Convert a character vector to a date

┣ year(time) : Extract the year from a date

┣ month(time) : Extract the month from a date

┣ day(time) : Extract the day from a date

┣ hour(time) : Extract the hour from a date-time

┣ minute(time) : Extract the minute from a date-time

┣ second(time) : Extract the second from a date-time

┣ library(zoo) : Load the zoo package for working with time series

┣ as.zoo(data) : Convert a data frame to a zoo object

┣ rollmean(zoo, k) : Calculate the rolling mean of a time series with a


window of k

┣ rollapply(zoo, k, function) : Apply a function to a rolling window of size


k in a time series

┣ library(forecast) : Load the forecast package for time series forecasting

┣ auto.arima(ts) : Automatically fit an ARIMA model to a time series

┣ ets(ts) : Fit an exponential smoothing model to a time series

┣ stl(ts) : Seasonal decomposition of a time series

┗ library(tidyquant) : Load the tidyquant package for financial time series


analysis

┗ tq_get(symbol) : Get financial data for a symbol

┗ tq_transmute(data, mutate_fun, select_fun, fill_fun) : Manipulate financial


data using tidyverse-style syntax.

By: Waleed Mousa


R Programming Ultimate Guide
Other Useful Functions
┣ which.max(data) : Return the index of the maximum value in a vector

┣ which.min(data) : Return the index of the minimum value in a vector

┣ table(data) : Create a frequency table of the values in a vector

┣ summarize(data, variable = function) : Apply a function to a variable in a


data frame

┣ arrange(data, variable) : Sort a data frame by a variable

┣ mutate(data, new_variable = function) : Create a new variable in a data


frame based on an existing variable using a user-defined function or a built-in
function.

┣ select(data, variables) : Select variables from a data frame

┣ filter(data, condition) : Filter rows from a data frame based on a


condition

┣ group_by(data, variable) : Group a data frame by a variable

┣ summarise(data, variable = function) : Apply a function to a variable in a


grouped data frame

┣ ungroup(data) : Remove grouping from a data frame

┣ merge(data1, data2, by = "variable") : Merge two data frames by a variable

┣ library(dplyr) : Load the dplyr package for data manipulation

┣ library(tidyr) : Load the tidyr package for data manipulation

┣ library(ggplot2) : Load the ggplot2 package for graphics

┣ library(stats) : Load the stats package for statistical analysis

┗ library(caret) : Load the caret package for machine learning

By: Waleed Mousa


R Programming Ultimate Guide
Resources
┣ R documentation: https://www.rdocumentation.org/

┣ RStudio cheat sheets: https://www.rstudio.com/resources/cheatsheets/

┣ The R Graphics Cookbook by Winston Chang

┣ R Graphics Cookbook, 2nd edition by Winston Chang

┣ An Introduction to Statistical Learning with Applications in R by Gareth


James, Daniela Witten, Trevor Hastie, and Robert Tibshirani

┣ Machine Learning with R, 2nd edition by Brett Lantz

┗ R for Data Science by Hadley Wickham and Garrett Grolemund

┣ R Programming for Data Science by Hadley Wickham

┣ Advanced R by Hadley Wickham

┣ Text Mining with R by Julia Silge and David Robinson

┣ Applied Time Series Analysis with R by Wayne A. Woodward and Henry L. Gray

┗ Tidyquant: A Tidy Approach to Financial Analysis by Matt Dancho

Useful Packages
┣ dplyr : Data manipulation

┣ tidyr : Data manipulation

┣ ggplot2 : Data visualization

┣ rvest : Web scraping

┣ quanteda : Text analysis

┣ lubridate : Working with dates and times

┣ zoo : Working with time series

┗ forecast : Time series forecasting

By: Waleed Mousa


R Programming Ultimate Guide
Useful Websites
┣ RStudio : https://www.rstudio.com/

┣ CRAN : https://cran.r-project.org/

┣ Stack Overflow : https://stackoverflow.com/questions/tagged/r

┣ R-bloggers : https://www.r-bloggers.com/

┗ DataCamp : https://www.datacamp.com/

┣ R for Data Science: https://r4ds.had.co.nz/

┣ Advanced R: https://adv-r.hadley.nz/

┣ R Graphics Cookbook: https://r-graphics.org/

┣ Machine Learning with R: https://www.tidyverse.org/

┣ Time Series Analysis and Its Applications:


https://www.stat.pitt.edu/stoffer/tsa4/

┣ Text Mining with R: https://www.tidytextmining.com/

┣ Web Scraping with R: https://www.r-bloggers.com/web-scraping-with-r-


tutorial/

┣ R Studio Cheat Sheets: https://www.rstudio.com/resources/cheatsheets/

┗ R Project for Statistical Computing: https://www.r-project.org/

Keyboard Shortcuts
┣ Ctrl+Enter : Run current line or selection

┣ Ctrl+Shift+Enter : Run current block

┣ Ctrl+Alt+I : Insert a new chunk

┣ Ctrl+Alt+L : Reformat current document

┣ Ctrl+Shift+M : Insert a markdown cell

┣ Ctrl+Shift+K : Insert a code cell above

┗ Ctrl+Shift+J : Insert a code cell below

By: Waleed Mousa

You might also like