BT1101 - R Code Cheatsheet 1.0
BT1101 - R code
Data Types
Vectors
Naming a vector
Summing Vectors
Vector selection
Matrix
Selecting elements from a Matrix
Forming Matrix by combining Vectors
Factors
Data Frames
Lists
Referencing elements from a list
Arrays
Dplyr verbs
Count verbs
Group by and summarize
top_n verb
Transmute verb
ggplot2
Data Visualisation
Pie Charts
Bar Charts/ Barplot
Clustered Bar Charts (grouped Barplot)
Plot() function
Creating a side-by-side plot array
R Markdown
Formatting with Markdown
R Chunk Options
Writing code in R markdown
Images in R markdown
Adding external images to R markdown
Data Types
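Decimal values such as 5.5 are numerics. Whole numbers typed as 4 are also stored as numerics by default;
to get an integer, use the L suffix (4L) or as.integer(). Integers also count as numerics (is.numeric() is TRUE for them).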
x <- 5
y <- 5.5
my_character <- "Hello World"
w <- 1+2i
fname = "Harry"
lname = "Potter"
paste(fname, lname)
x <- as.integer(5)
class(x)
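A minimal sketch of how class() reports the basic data types. The variable names are illustrative only and not from the course material.

```r
# Check the class of each basic data type
num_value <- 5                 # stored as numeric (double) by default
int_value <- 5L                # the L suffix creates an integer
cast_value <- as.integer(5)    # explicit conversion to integer

class(num_value)               # "numeric"
class(int_value)               # "integer"
class(cast_value)              # "integer"
is.numeric(int_value)          # TRUE: integers also count as numerics
class(1 + 2i)                  # "complex"
class("Hello World")           # "character"
class(TRUE)                    # "logical"
```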
Vectors
Vectors are 1D arrays that can hold numeric, character, or logical data
Within a single vector, all elements must be of the same type; mixed values are coerced to a common type
Examples of vectors:
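The original examples did not survive here, so the following is a stand-in sketch with made-up values. It creates one vector per type with c() and shows what happens when types are mixed.

```r
# One vector per basic type, created with c()
numeric_vector <- c(1, 10, 49)
character_vector <- c("a", "b", "c")
logical_vector <- c(TRUE, FALSE, TRUE)

# Mixing types coerces every element to a common type (character here)
mixed_vector <- c(1, "a", TRUE)
class(mixed_vector)            # "character"
```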
Naming a vector
You can give a name to the elements of a vector with the names() function
#print out vector
my_vector
A more efficient way to name vectors is to assign the labels as a vector as well
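Both naming approaches can be sketched as follows. The values "kathleen" and "19" come from the sample output in this cheatsheet; other_vector and label_vector are hypothetical names added for illustration.

```r
# Name the elements directly with names()
my_vector <- c("kathleen", "19")
names(my_vector) <- c("Name", "Age")
my_vector

# Store the labels in their own vector so they can be reused
label_vector <- c("Name", "Age")
other_vector <- c("harry", "17")     # hypothetical second vector
names(other_vector) <- label_vector
other_vector["Age"]                  # select an element by its name
```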
Summing Vectors
Arithmetic operations on numeric vectors are applied element by element
You can sum the total of a numeric vector with the function sum()
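A short sketch with made-up values showing element-wise addition and sum():

```r
a_vector <- c(1, 2, 3)
b_vector <- c(4, 5, 6)

total_vector <- a_vector + b_vector   # element-wise: 5 7 9
sum(a_vector)                         # 6
sum(total_vector)                     # 21
```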
Vector selection
To select a specific element of a vector, put its index (starting from 1) or its name in square brackets after the vector name
#Output will be
Name Age
"kathleen" "19"
Comparison operators return a logical vector, which can be used to select the matching values
vector1 > 4
[1] FALSE FALSE FALSE ... ... TRUE
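The selection styles above can be sketched as follows. The contents of vector1 are hypothetical, since the original values are not shown.

```r
vector1 <- c(1, 3, 5, 7)       # hypothetical values

vector1[2]                     # single element by position: 3
vector1[c(1, 4)]               # several positions: 1 7
vector1[2:3]                   # a range of positions: 3 5

selection <- vector1 > 4       # FALSE FALSE TRUE TRUE
vector1[selection]             # only the values greater than 4: 5 7
```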
Matrix
We create a matrix using the function matrix()
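A minimal sketch of matrix() and of selecting elements from a matrix with [row, column] indexing. The values 1:9 are chosen only for illustration.

```r
# Fill a 3 x 3 matrix row by row with the numbers 1 to 9
m <- matrix(1:9, byrow = TRUE, nrow = 3)

m[1, 2]        # element in row 1, column 2: 2
m[2, ]         # all of row 2: 4 5 6
m[, 3]         # all of column 3: 3 6 9
m[1:2, 2:3]    # rows 1-2 and columns 2-3 as a smaller matrix
```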
v1 <- c(1, 2, 3, 4)
v2 <- c(5, 6, 7, 8)
rbind(v1, v2) # combine the vectors as rows
#    [,1] [,2] [,3] [,4]   # output
# v1    1    2    3    4
# v2    5    6    7    8
cbind(v1, v2) # combine the vectors as columns
#      v1 v2   # output
# [1,]  1  5
# [2,]  2  6
# [3,]  3  7
# [4,]  4  8
Factors
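The content of this section did not survive, so the following is only a hedged sketch of the usual way factors are created with factor(). All values and variable names are made up for illustration.

```r
# A factor stores categorical data as a fixed set of levels
sex_vector <- c("Male", "Female", "Female", "Male")
factor_sex <- factor(sex_vector)
levels(factor_sex)                     # "Female" "Male" (alphabetical by default)

# An ordered factor lets the levels be compared
temperature <- factor(c("High", "Low", "Medium", "Low"),
                      ordered = TRUE,
                      levels = c("Low", "Medium", "High"))
temperature[1] > temperature[2]        # TRUE: High ranks above Low
summary(temperature)                   # counts per level
```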
Data Frames
A data frame is a list of vectors of equal length, used for storing data tables; each vector becomes a column
d <- c(1, 2, 3, 4)
e <- c("red", "yellow", "blue", NA)
f <- c(TRUE, TRUE, TRUE, FALSE)
mydata <- data.frame(d, e, f)
names(mydata) <- c("ID", "Color", "Passed")
mydata # print the dataframe
#output will be
  ID  Color Passed
1  1    red   TRUE
2  2 yellow   TRUE
3  3   blue   TRUE
4  4   <NA>  FALSE
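As a follow-up sketch, the same data frame can be rebuilt and queried by column and by row. The access patterns below are standard base R and are not taken from the course notes.

```r
# Rebuild the data frame from the example above
mydata <- data.frame(ID = c(1, 2, 3, 4),
                     Color = c("red", "yellow", "blue", NA),
                     Passed = c(TRUE, TRUE, TRUE, FALSE))

mydata$Color                   # one column as a vector
mydata[2, ]                    # the second row
mydata[mydata$Passed, "ID"]    # IDs of the rows where Passed is TRUE: 1 2 3
nrow(mydata)                   # number of rows: 4
str(mydata)                    # compact overview of the columns and their types
```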
Lists
a <- c(1, 2, 3)
b <- c(T, F)
list2 <- list(first = a, second = b) # assign names to the list members
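Referencing elements from the list above can be sketched as follows. Double brackets or $ return the member itself, while single brackets return a smaller list.

```r
a <- c(1, 2, 3)
b <- c(TRUE, FALSE)
list2 <- list(first = a, second = b)

list2[[1]]             # first member by position: the vector 1 2 3
list2$second           # member by name: TRUE FALSE
list2[["first"]][2]    # second element of the member "first": 2
list2["first"]         # single brackets return a list containing that member
```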
Arrays
Arrays are similar to matrices, but can have more than 2 dimensions
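A small sketch of a 3-dimensional array built with array(). The dimensions and values are chosen only for illustration.

```r
# 2 rows x 3 columns x 4 layers, filled column by column with 1 to 24
arr <- array(1:24, dim = c(2, 3, 4))

dim(arr)           # 2 3 4
arr[1, 2, 3]       # row 1, column 2, layer 3: 15
arr[, , 1]         # the first layer as a 2 x 3 matrix
```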
Dplyr verbs
Usually chained with the pipe operator %>%, which passes the table on its left as the first argument of the verb on its right
counties %>%
  select(state, county, population, poverty)
counties %>%
  select(drive:work_at_home) # another way to select: all the columns from drive to work_at_home
counties %>%
  arrange(public_work) # arrange your data in ascending order
counties %>%
  arrange(desc(public_work)) # arrange your data in descending order
filter() - keeps only the rows that match the condition; more than one condition can be given
counties %>%
  filter(state == "Texas", population > 10000)
# Only shows rows for Texas with a population of more than 10000
mutate() - adds a new column computed from existing ones
counties %>%
  mutate(public_workers = population * public_work / 100)
glimpse() is another function that is commonly called on its own, without %>%. It is used to examine all the variables
in a table
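To see the verbs working together, here is a self-contained sketch. The counties dataset itself is not included in this cheatsheet, so counties_demo and its values are hypothetical stand-ins.

```r
library(dplyr)

# Hypothetical miniature of the counties table
counties_demo <- tibble(
  state = c("Texas", "Texas", "Utah"),
  county = c("A", "B", "C"),
  population = c(50000, 8000, 20000),
  public_work = c(10, 20, 15)          # percentage of public workers
)

result <- counties_demo %>%
  select(state, county, population, public_work) %>%
  filter(state == "Texas", population > 10000) %>%
  mutate(public_workers = population * public_work / 100) %>%
  arrange(desc(public_workers))

glimpse(result)    # one row: county A with 5000 public workers
```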
Count verbs
Non-weighted
counties %>%
  count(region, sort = TRUE)
Weighted
counties %>%
  count(region, wt = citizen, sort = TRUE)
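The difference between the two counts can be sketched with a hypothetical table (region_demo and its values are made up). The unweighted count tallies rows, while wt sums a column per group.

```r
library(dplyr)

region_demo <- tibble(
  region = c("South", "South", "West"),
  citizen = c(100, 200, 500)
)

row_counts <- region_demo %>%
  count(region, sort = TRUE)                  # South 2, West 1

weighted_counts <- region_demo %>%
  count(region, wt = citizen, sort = TRUE)    # West 500, South 300
```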
Group by and summarize
Summarize functions include:
sum()
mean()
median()
min()
max()
counties %>%
  group_by(state) %>%
  summarize(total_population = sum(population))
group_by(state) means that rows with the same state are handled together as one group
instead of individually, and summarize() creates a new column total_population, which is the total population
of each state.
group_by(state, metro) does the same, but instead of grouping by state alone it groups by each combination of
state and metro
ungroup() removes the grouping so that later verbs operate on the whole table again
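A minimal sketch of grouping by one and by two columns, using a hypothetical table (group_demo and its values are made up):

```r
library(dplyr)

group_demo <- tibble(
  state = c("Texas", "Texas", "Utah", "Utah"),
  metro = c("Metro", "Nonmetro", "Metro", "Metro"),
  population = c(100, 50, 30, 20)
)

# One row per state
by_state <- group_demo %>%
  group_by(state) %>%
  summarize(total_population = sum(population))

# One row per state/metro combination, then drop the remaining grouping
by_state_metro <- group_demo %>%
  group_by(state, metro) %>%
  summarize(total_population = sum(population)) %>%
  ungroup()
```

Here by_state has two rows (Texas 150, Utah 50) and by_state_metro has three.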
top_n verb
Operates on grouped data, so group the table first
counties %>%
  group_by(state) %>%
  top_n(1, population)
# '1' is the number of most populated counties per state
# that will be displayed
counties %>%
  group_by(state) %>%
  top_n(2, men)
# this displays the top 2 counties with the most men in each state
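A self-contained sketch of top_n() on grouped data, with a hypothetical table. Recent dplyr documentation lists slice_max() as the successor to top_n(); top_n() is kept here to match this cheatsheet.

```r
library(dplyr)

top_demo <- tibble(
  state = c("Texas", "Texas", "Utah", "Utah"),
  county = c("A", "B", "C", "D"),
  population = c(100, 50, 30, 20)
)

# Keep the most populated county within each state
top_counties <- top_demo %>%
  group_by(state) %>%
  top_n(1, population)
```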
Transmute verb
Combines select() and mutate(): keeps only the listed columns and computes new ones in the same step
counties %>%
  transmute(state, county, fraction_men = men / population)
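The effect of transmute() can be sketched with a hypothetical table (men_demo and its values are made up). Columns not listed, such as men and population, are dropped from the result.

```r
library(dplyr)

men_demo <- tibble(
  state = c("Texas", "Utah"),
  county = c("A", "C"),
  population = c(100, 40),
  men = c(50, 10)
)

fractions <- men_demo %>%
  transmute(state, county, fraction_men = men / population)

names(fractions)    # "state" "county" "fraction_men"
```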
ggplot2
To create a line graph of the data in your table, use the ggplot() function together with geom_line() after loading
the package with library(ggplot2)
# Example
selected_names <- babynames %>%
  filter(name %in% c("Steven", "Thomas", "Matthew"))
ggplot(selected_names, aes(x = year, y = number, color = name)) +
  geom_line()
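Since the babynames table is not included here, the same plot can be tried with a small hypothetical data frame. names_demo and its numbers are made up; the column names mirror the example above.

```r
library(ggplot2)

names_demo <- data.frame(
  year = rep(c(2000, 2001, 2002), times = 2),
  name = rep(c("Steven", "Thomas"), each = 3),
  number = c(10, 12, 9, 7, 8, 11)
)

# One line per name, coloured by name
line_plot <- ggplot(names_demo, aes(x = year, y = number, color = name)) +
  geom_line()
print(line_plot)
```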
Data Visualisation
Pie Charts
Not recommended
Use labels
Avoid the 3D version
#Example with percentages and colours:
#use the paste function to join each label with its percentage, then append "%" to the end
label <- paste(label, piepercent)
label <- paste(label, "%", sep = "")
#rainbow() gives one colour per label (6 here, derived from length(label))
pie(slices, labels = label, col = rainbow(length(label)), main = "Pie Chart of Non-High School Grads")
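The definitions of slices, label and piepercent are not shown above, so here is a complete sketch with hypothetical values. The variable names match the example, but the data and title are made up.

```r
slices <- c(10, 30, 60)                          # hypothetical counts
label <- c("A", "B", "C")                        # hypothetical category names
piepercent <- round(100 * slices / sum(slices), 1)

label <- paste(label, piepercent)                # "A 10" "B 30" "C 60"
label <- paste(label, "%", sep = "")             # "A 10%" "B 30%" "C 60%"

pie(slices, labels = label, col = rainbow(length(label)),
    main = "Pie chart with percentages")
```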
Arguments of barplot(): names.arg - labels of the bars, xlab - x-axis label, ylab - y-axis label, main - title of the bar chart,
col - colour of the bar(s), cex.names - size of the bar labels
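A hedged sketch of a simple and a clustered (grouped) bar chart using the arguments listed above. All data values and labels are hypothetical; beside = TRUE is the barplot() argument that places the bars of each group side by side.

```r
# Simple bar chart
counts <- c(12, 7, 9)
barplot(counts, names.arg = c("A", "B", "C"),
        xlab = "Category", ylab = "Count", main = "Simple bar chart",
        col = "steelblue", cex.names = 0.8)

# Clustered bar chart: one row per group, one column per category
grouped <- matrix(c(5, 7, 3,
                    6, 8, 4),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("2019", "2020"), c("A", "B", "C")))
barplot(grouped, beside = TRUE, legend.text = rownames(grouped),
        col = c("grey40", "grey80"), main = "Clustered bar chart")
```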
Plot() function
title(<name of the title>) - adds a title after plotting
log="xy" - scales the x and y axes by log. log="y" - scales only the y axis. log="x" - scales only
the x axis
# to plot scatterplots
plot(<x_data>, <y_data>, xlab = 'x-axis label', ylab = 'y-axis label',
     pch = <shape_of_points>, col = 'colour of points')
# use points() to add more scatter plots on the same xy axes
points(<x_data>, <y_data>)
# to plot line charts (for more info refer to data visualisation slides)
v <- c(7, 12, 4, 5, 10)
t <- c(1, 14, 6, 5, 13)
plot(v, type = "o", col = "red", xlab = "x-axis label", ylab = "y-axis label", main = "name")
# use lines() to add more line charts in the same xy axis
lines(t, type = "o", col = "blue")
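A runnable sketch that fills in the scatterplot placeholders with hypothetical data and also uses log and title() as described above:

```r
x_data <- c(1, 10, 100, 1000)      # hypothetical values
y_data <- c(2, 20, 150, 900)

# Scatterplot with both axes on a log scale
plot(x_data, y_data, log = "xy", pch = 19, col = "red",
     xlab = "x-axis label", ylab = "y-axis label")
title("Scatterplot on log-log axes")   # add the title after plotting

# Add a second set of points on the same axes
points(x_data, y_data * 2, pch = 17, col = "blue")
```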
Creating a side-by-side plot array
par(mfrow = c(<num of row>, <num of col>))
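For example, two plots can be placed side by side in one row and two columns. The data values are taken from the line chart example in this cheatsheet; the names first_series, second_series and old_par are illustrative.

```r
first_series <- c(7, 12, 4, 5, 10)
second_series <- c(1, 14, 6, 5, 13)

old_par <- par(mfrow = c(1, 2))    # 1 row, 2 columns; keep the old setting
plot(first_series, type = "o", col = "red", main = "First plot")
plot(second_series, type = "o", col = "blue", main = "Second plot")
par(old_par)                       # restore the single-plot layout
```

Saving and restoring the previous par() settings is a design choice that keeps later plots from staying in the split layout.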
R Markdown
Formatting with Markdown
We use # to make a header. The number of # characters sets the level of the header
# National University of Singapore <- National University of Singapore is a first level header
## Hello <- Hello is a second level header
[YouTube](https://youtube.com) -> the word YouTube is rendered as a clickable (blue) link
R Chunk Options
R chunk option - function
include = TRUE/FALSE - whether to show the R code chunk and its output
echo = TRUE/FALSE - whether to show the R code chunk
message = TRUE/FALSE - whether to show output messages
warning = TRUE/FALSE - whether to show output warnings
eval = TRUE/FALSE - whether to actually evaluate the R code chunk
2 + 2 equals `r 2+2`
output will be -> 2 + 2 equals 4
Images in R markdown
R chunk option - possible values - effect
fig.height = - numeric, inches - the height of the images in inches
fig.width = - numeric, inches - the width of the images in inches
fig.align = - one of "left", "right" or "center" - the alignment of the image in the report
# Example:
![An impressive mountain!](<link or path to the image>)
# "An impressive mountain!" is the caption for the image and is optional
# inside the round brackets goes either the link to the image or the path to the local file containing the image