R-Programming-Cheat-Sheet
This cheat sheet provides a quick reference for essential R programming commands, helping you perform data manipulation, visualization, and statistical analysis with confidence. It covers foundational topics like installing packages and understanding R's data structures, alongside advanced tasks such as building models and applying machine learning techniques.

Each section includes concise syntax and practical examples to illustrate how R commands are used in real-world scenarios. You'll find guidance on working with vectors, lists, matrices, and data frames, and on performing common data manipulation.

Designed for clarity and accessibility, this resource is ideal for data analysts, statisticians, and programmers seeking to enhance their workflows in R. Whether you're exploring data, developing algorithms, or building reproducible reports, this cheat sheet ensures you can quickly apply R's powerful tools to your projects.

Basics: install.packages, library, assignment (<-), print, class
Statistics: mean, median, sd, cor, lm
Data Structures: c, list, matrix, data.frame, df$a or df
Programming: if, for, while, function, apply
Data Manipulation: filter, select, mutate,
Machine Learning: Matrices, Linear Model, Visualize
R Cheat Sheet
Basics

Syntax for: Install Package
How to use: install.packages("dplyr")
Explained: Installs the dplyr package.

Syntax for: Load Package
How to use: library(dplyr)
Explained: Loads the dplyr package into the current R session.

Syntax for: Print Output
How to use: print(x)
Explained: Prints the value of x to the console.

Syntax for: Literals and Data Types
How to use: TRUE, 125L, 12.5, "Hello"
Explained: Examples of logical, integer, numeric, and character literals in R. The L suffix marks an integer; a bare 125 is stored as numeric.

Data Structures

Syntax for: Create Vector
How to use: c(1, 2, 3)
Explained: Combines elements into a vector.

Syntax for: Create List
How to use: list(a=1, b="two")
Explained: Creates a list with named elements.

Syntax for: Create Data Frame
How to use: data.frame(a=1:3, b=4:6)
Explained: Creates a data frame with columns a and b.

Syntax for: Access Element
How to use: df$a or df[1, 1]
Explained: df$a returns column a; df[1, 1] returns the element in row 1, column 1.
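To see how the Basics and Data Structures entries fit together, here is a minimal sketch that builds each structure and inspects it with class(). The variable names (v, l, col_a, first_cell, second_b) are illustrative and not part of the sheet; the 125L literal is included only to show how R tells integers apart from plain numerics.

```r
# Literal types: the L suffix makes an integer; a bare number is numeric.
type_logical   <- class(TRUE)      # "logical"
type_integer   <- class(125L)      # "integer"
type_numeric   <- class(125)       # "numeric"
type_character <- class("Hello")   # "character"

# The three structures from the table above.
v  <- c(1, 2, 3)                     # vector
l  <- list(a = 1, b = "two")         # list with named elements
df <- data.frame(a = 1:3, b = 4:6)   # data frame with columns a and b

# Two ways to reach into a data frame.
col_a      <- df$a         # the whole column a
first_cell <- df[1, 1]     # row 1, column 1
second_b   <- df[2, "b"]   # row 2 of column b, addressed by name

print(col_a)
print(first_cell)
```

Indexing with df[row, column] accepts either positions or column names, so df[2, "b"] and df[2, 2] pick the same cell here.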
Data Manipulation

Syntax for: Select …

Syntax for: Mutate Columns
How to use: mutate(df, c = a + b)
Explained: Adds a new column c as the sum of a and b.

Data Visualization

How to use: … facet_wrap(~variable_3)
How to use: … v")
How to use: … scales::comma)
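The mutate entry can be tried end to end on a small data frame. The sketch below assumes the dplyr package is installed and uses a toy df with columns a and b; the filter() and select() calls are standard dplyr verbs added here only to show how they chain with mutate(), and the object names (with_sum, big_rows, only_ac) are hypothetical.

```r
library(dplyr)

# Toy data frame, matching the a and b columns used elsewhere in the sheet.
df <- data.frame(a = 1:3, b = 4:6)

# mutate(): add column c as the sum of a and b.
with_sum <- mutate(df, c = a + b)

# filter(): keep only the rows where c is greater than 5.
big_rows <- filter(with_sum, c > 5)

# select(): keep only columns a and c.
only_ac <- select(big_rows, a, c)

print(only_ac)
```

Each verb takes a data frame as its first argument and returns a new one, which is why the same steps can also be written as a %>% pipeline.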
Data Visualization (continued)

Syntax for: Grouped Bar Plot
How to use:
    ggplot(data = df, aes(x = col_1, fill = col_2)) +
      geom_bar(position = "dodge")
Explained: Creates a grouped bar plot to compare frequency distributions of categorical variables.

Statistics & Probability

Syntax for: P-Value Decision Threshold
How to use:
    if (p_value < 0.05) {
      print('Reject null hypothesis')
    } else {
      print('Fail to reject null hypothesis')
    }
Explained: Decide on hypothesis rejection using a common p-value threshold of 0.05.
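Both entries on this page can be exercised with a few lines of toy input. The sketch below assumes ggplot2 is installed; the df contents are made up for illustration, and decide() is a hypothetical wrapper around the same if/else rule, with the threshold exposed as an alpha argument.

```r
library(ggplot2)

# Illustrative categorical data (not from the sheet).
df <- data.frame(
  col_1 = c("A", "A", "B", "B", "B", "C"),
  col_2 = c("x", "y", "x", "x", "y", "y")
)

# Grouped bar plot: one bar per col_2 level within each col_1 group.
p <- ggplot(data = df, aes(x = col_1, fill = col_2)) +
  geom_bar(position = "dodge")
# Call print(p) to render the plot in an interactive session.

# The p-value decision rule wrapped in a small helper.
decide <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) {
    "Reject null hypothesis"
  } else {
    "Fail to reject null hypothesis"
  }
}

decision_small <- decide(0.03)
decision_large <- decide(0.20)
print(decision_small)
print(decision_large)
```

Returning the decision as a string rather than printing it inside the function makes the rule easier to reuse in reports or tests.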
Statistics & Probability (continued)

Syntax for: Chi-Squared Distribution
How to use: pchisq(3.84, df = 1)
Explained: Calculates the cumulative probability for a chi-squared distribution with specific degrees of freedom.

Syntax for: Chi-Squared Test
How to use: pchisq(q = 10, df = 5)
Explained: Calculates the cumulative probability for a chi-squared statistic of 10 with 5 degrees of freedom.

Syntax for: Multi-category Chi-squared Test
How to use: data <- table(income$sex, income$high_income)
Explained: Builds the contingency table of sex against high_income on which the chi-squared test is performed (for example, by passing it to chisq.test()).

Syntax for: Computing Mode in R
How to use:
    # Requires dplyr, which also provides tibble() and %>%.
    compute_mode <- function(vector) {
      counts_df <- tibble(vector) %>%
        group_by(vector) %>%
        summarise(frequency = n()) %>%
        arrange(desc(frequency))
      counts_df$vector[1]
    }
Explained: Defines a function to calculate the mode of a given vector using dplyr functions.

Syntax for: Calculate Z-score
How to use:
    z_score <- function(value, vector) {
      (value - mean(vector)) / sd(vector)
    }
Explained: Calculates the Z-score for a value relative to a vector's distribution.

Syntax for: Simulate Coin Toss
How to use:
    set.seed(1)
    coin_toss <- function() {
      if (runif(1) <= 0.5) 'HEADS' else 'TAILS'
    }
Explained: Simulates a random coin toss using R's uniform random numbers.

Syntax for: Addition Rule for Probability
How to use: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Explained: Formula to calculate probabilities of unions of events, adjusting for overlap in non-exclusive cases.

Syntax for: Independent Events
How to use: P(A ∩ B) = P(A) * P(B)
Explained: The probability that independent events both occur is the product of their individual probabilities.

Syntax for: Product Rule in Experiments
How to use: total_outcomes <- a * b
Explained: Calculates the total outcomes for two independent experiments using the product rule.

Syntax for: Uniform Distribution
How to use:
    # Assuming all outcomes have equal chance
    outcomes <- c(1, 2, 3, 4, 5, 6)
    probabilities <- rep(1/6, 6)
    paste('Outcome:', outcomes, 'Probability:', probabilities)
Explained: Demonstrates a uniform distribution for a dice roll, where all outcomes are equally likely.
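The statistics entries above can be checked against each other with base R alone. The following sketch makes several assumptions: the income data frame is invented to match the column names used in the sheet, chisq.test() stands in for the test step, and the dice events A and B are illustrative. It is a worked example, not a prescribed workflow.

```r
# Cumulative probability and the matching upper-tail p-value.
cum_prob <- pchisq(3.84, df = 1)                       # close to 0.95
p_val    <- pchisq(3.84, df = 1, lower.tail = FALSE)   # close to 0.05

# Hypothetical income data with the two columns named in the sheet.
income <- data.frame(
  sex = rep(c("F", "M"), each = 50),
  high_income = c(rep(c(TRUE, FALSE), c(10, 40)),
                  rep(c(TRUE, FALSE), c(25, 25)))
)
data <- table(income$sex, income$high_income)
test_result <- chisq.test(data)   # chi-squared test of independence

# Z-score of 5 relative to the vector 1..5.
z_score <- function(value, vector) {
  (value - mean(vector)) / sd(vector)
}
z <- z_score(5, c(1, 2, 3, 4, 5))

# Repeating the coin toss many times: the share of heads should be near 0.5.
set.seed(1)
coin_toss <- function() {
  if (runif(1) <= 0.5) "HEADS" else "TAILS"
}
tosses <- replicate(10000, coin_toss())
prop_heads <- mean(tosses == "HEADS")

# Addition rule on a fair die: A = even roll, B = roll greater than 3.
omega <- 1:6
A <- c(2, 4, 6)
B <- c(4, 5, 6)
prob <- function(event) length(event) / length(omega)
p_union <- prob(union(A, B))
p_rule  <- prob(A) + prob(B) - prob(intersect(A, B))

print(c(cum_prob = cum_prob, p_val = p_val, z = z, prop_heads = prop_heads))
print(test_result$p.value)
```

Note that pchisq() returns the lower-tail probability by default, so a test's p-value is the complement, obtained here with lower.tail = FALSE.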
Statistics & Probability (continued)

Syntax for: … Probability
How to use: … B)) / length(B)
Explained: … cardinalities.

Syntax for: Conditional Probability Definition
How to use: P_A_given_B <- 1 - P_Ac_given_B
Explained: Conditional probabilities are interrelated: P(A|B) and its complement P(Ac|B) can be …

Syntax for: Independence
How to use: P_A_and_B <- P_A * P_B
Explained: Defines independent events joint …

Programming

Syntax for: While Loop
How to use: while (x < 5) x <- x + 1
Explained: Repeats code while the x < 5 condition is true.

Syntax for: Syntax for functions in R
How to use:
    function_name <- function(input) {
      # Code to manipulate the input
    }
Explained: Defines a reusable function structure.
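The conditional probability, independence, loop, and function entries can be combined into one small worked example. In the sketch below, the dice events A and B and the helper names (prob, square_input) are illustrative. Computing P(A|B) as length(intersect(A, B)) / length(B) is an assumption about the counting approach the sheet refers to, valid only when all outcomes are equally likely.

```r
# Sample space: one roll of a fair die (illustrative, not from the sheet).
omega <- 1:6
A <- c(2, 4, 6)   # even roll
B <- c(4, 5, 6)   # roll greater than 3

prob <- function(event) length(event) / length(omega)

# Conditional probability from set sizes (equally likely outcomes assumed).
P_A_given_B  <- length(intersect(A, B)) / length(B)
P_Ac_given_B <- length(setdiff(B, A)) / length(B)

# The two conditional probabilities are complements of each other.
complement_ok <- isTRUE(all.equal(P_A_given_B, 1 - P_Ac_given_B))

# Independence check: does P(A and B) equal P(A) * P(B)?
P_A_and_B   <- prob(intersect(A, B))
independent <- isTRUE(all.equal(P_A_and_B, prob(A) * prob(B)))

# While loop: keep incrementing until the condition x < 5 fails.
x <- 0
while (x < 5) x <- x + 1

# A reusable function, applied to each element with sapply().
square_input <- function(input) {
  input^2
}
squares <- sapply(1:3, square_input)

print(c(P_A_given_B = P_A_given_B, P_Ac_given_B = P_Ac_given_B))
print(independent)
print(squares)
```

Here A and B turn out not to be independent, since P(A ∩ B) is 1/3 while P(A) * P(B) is 1/4, which is why the product formula should only be used after independence has been established.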
Machine Learning

Syntax for: Fitting a Linear Model
How to use: lm_fit <- lm(response ~ predictor, data = df)
Explained: Fit a linear regression model with a response and a predictor variable.

Syntax for: Visualize Residuals
How to use:
    library(ggplot2)
    ggplot(data.frame(residuals = lm_fit$residuals), aes(x = residuals)) +
      geom_histogram()
Explained: Visualize the distribution of residuals to check the linear model's fit.

How to use: … plot(knn_model)

File I/O

Syntax for: Read CSV
How to use: read.csv("file.csv")
Explained: Reads a CSV file into a data frame.

Syntax for: Write CSV
How to use: write.csv(df, "file.csv")
Explained: Writes a data frame to a CSV file.

Syntax for: Read RDS
How to use: readRDS("file.rds")
Explained: Reads an RDS file into R.
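A short round trip ties the modelling and file entries together. The sketch below assumes the built-in mtcars dataset as a stand-in for df, with mpg as the response and wt as the predictor. It writes to temporary files rather than "file.csv" so nothing in the working directory is touched, and it uses saveRDS() as the writing counterpart to readRDS().

```r
# Fit a linear model on a built-in dataset (stand-in for your own df).
lm_fit <- lm(mpg ~ wt, data = mtcars)
res <- lm_fit$residuals   # the residuals you would plot as a histogram

# CSV round trip through a temporary file.
df <- data.frame(a = 1:3, b = 4:6)
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)   # skip the row-name column
df_back <- read.csv(csv_path)

# RDS round trip: stores any single R object, here the fitted model.
rds_path <- tempfile(fileext = ".rds")
saveRDS(lm_fit, rds_path)
lm_back <- readRDS(rds_path)

print(coef(lm_fit))
print(df_back)
```

Passing row.names = FALSE to write.csv() keeps the file free of an extra index column, so reading it back yields the same columns that were written.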