R Programming Checklist of Basic Skills With Examples
By
Aliyu Sambo
1/25/2021
Table of Contents
Introduction
Mathematical Operators and Functions
Logical Operators
Working with Sequences
Working with Arrays and Matrices
  Add or Delete Elements in Vectors and Matrices
Reading Data Sets in R
  Configuring the R Workspace
  Working with data sets that are available in R
  Read TXT files with read.table()
  Read CSV Excel Files into R
Built-in Functions
  Some Useful Built-in Functions for Vectors
  Some Useful Built-in Functions for Matrices
Data Frame
  Creating A Data Frame from Vectors
  Changing class of the object
  Accessing data from Data Frame
  Data Subsetting
Control Structures
  if and else
  for loops
  Nested for loops
  while
  repeat loops and break
  loops and next commands
Functions
  return in functions
  Named Parameters and Default Parameters
The plot() function
Working with RMarkdown
Introduction
This document highlights some of the basic and core skills that may be useful for the MSC
BDA students; examples are provided where possible.
The skills covered here are neither comprehensive nor mandatory. Rather, they indicate a
reasonable starting point.
Mathematical Operators and Functions
## [1] 10
5*5 # Multiplication
## [1] 25
55/5 # Division
## [1] 11
5^2 # Square
## [1] 25
# Functions:
log(5) # natural logarithm
## [1] 1.609438
log10(1000) # log base 10
## [1] 3
exp(5) # exponential
## [1] 148.4132
## [1] 25
a = 3
b = a^2
print(b) # print() displays the value of a variable
## [1] 9
Logical Operators
Usage examples:
1) to check whether a condition is True (T) or False (F).
2) to subset data using specified criteria
3) to control the flow of a program e.g. in loops, functions, etc.
Symbols:
> for ‘greater than’, < for ‘less than’, >= for ‘greater than or equal to’, <= for ‘less than or
equal to’, == for ‘equality’, != for ‘inequality’, | for ‘Or’, & for ‘And’
Examples:
3>6 # is 3 greater than 6?
## [1] FALSE
## [1] FALSE
## [1] TRUE
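The operators listed above can be tried directly on the console. The following is a minimal sketch; the vector name scores_demo and its values are hypothetical and chosen only for illustration.

```r
# Comparison operators return TRUE or FALSE
3 > 6     # FALSE
3 <= 6    # TRUE
3 == 6    # FALSE
3 != 6    # TRUE

# Combine conditions with & (and) and | (or); negate with !
(3 < 6) & (6 < 10)   # TRUE
(3 > 6) | (6 < 10)   # TRUE
!(3 > 6)             # TRUE

# Comparisons are vectorised: one result per element
scores_demo <- c(4, 11, 7, 15)   # hypothetical values
over_five <- scores_demo > 5
over_five                        # FALSE TRUE TRUE TRUE

# A logical vector can be used to subset
mid_range <- scores_demo[over_five & scores_demo < 12]
mid_range                        # 11 7
```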
Working with Sequences
## [1] 1 2 3 4 5 6 7 8 9 10 11 12
a=seq(5,12)
## [1] 1 3 5 7 1 3 5 7 1 3 5 7
## [1] 1 1 1 3 3 3 5 5 5 7 7 7
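The outputs in this section can be produced with the colon operator, seq() and rep(). The calls below are one possible way to do so; since the original commands are not all shown, the exact calls (and the names s1, s2, s3, r1, r2) are assumptions.

```r
s1 <- 1:12                 # integers from 1 to 12
s2 <- seq(5, 12)           # 5 6 7 8 9 10 11 12
s3 <- seq(1, 7, by = 2)    # 1 3 5 7 (step of 2)

r1 <- rep(s3, times = 3)   # repeat the whole vector: 1 3 5 7 1 3 5 7 1 3 5 7
r2 <- rep(s3, each = 3)    # repeat each element:     1 1 1 3 3 3 5 5 5 7 7 7

s1
s2
r1
r2
```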
Working with Arrays and Matrices
# To extract the rth entry of a vector v, use v[r], where r is a valid index
v[3]
## [1] 0.79
## [1] 4
# The logical operators are used to extract elements from an array based on some criteria, e.g.
v=c(1.3,3.3,.77,10,25)
v[v>2]
## [1] 3.3 10.0 25.0
v[v==1.2 | v<1]
## [1] 0.77
# By using c() (the combine function), you can create a vector that contains only characters.
daga=c("dada","yaya","2020","bibi")
daga
## [1] "dada" "yaya" "2020" "bibi"
daga[2]
## [1] "yaya"
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
# Or
matx2=matrix(c(1,2,3,4,5,6),3,2) # specify the (rows by columns) arrangement, i.e. 3 rows by 2 columns in this example
matx2
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
## [1] 2
## day night
## Jan 1 4
## Feb 2 5
## Mar 3 6
matx["Jan","day"]
## [1] 1
#You can find out the dimension of any matrix with dim command.
matx
## day night
## Jan 1 4
## Feb 2 5
## Mar 3 6
dim(matx)
## [1] 3 2
nrow(matx)
## [1] 3
ncol(matx)
## [1] 2
## [,1] [,2]
## [1,] 10 3
## [2,] 0 -2
## [3,] -6 1
## [,1] [,2]
## [1,] 2 -1
## [2,] 0 -4
## [3,] 4 3
## [,1] [,2]
## [1,] 24 2
## [2,] 0 -3
## [3,] 5 -2
C<-matrix(c(2,4,5,6),nrow=2)
C
## [,1] [,2]
## [1,] 2 5
## [2,] 4 6
## [,1] [,2]
## [1,] 16 36
## [2,] -12 -18
## [3,] 6 7
solve(C) # inverse of C
## [,1] [,2]
## [1,] -0.75 0.625
## [2,] 0.50 -0.250
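The matrix outputs in this section are consistent with element-wise addition, subtraction and multiplication of two 3 by 2 matrices, a matrix product with C, and the inverse of C. The sketch below reproduces those results; the definitions of A and B are not shown in the text, so they are assumptions inferred from the printed outputs.

```r
# A and B are assumed (inferred from the outputs); C is defined as in the text
A <- matrix(c(6, 0, -1, 1, -3, 2), nrow = 3)
B <- matrix(c(4, 0, -5, 2, 1, -1), nrow = 3)
C <- matrix(c(2, 4, 5, 6), nrow = 2)

sum_AB         <- A + B     # element-wise addition
diff_AB        <- A - B     # element-wise subtraction
elementwise_AB <- A * B     # element-wise multiplication (not a matrix product)
product_AC     <- A %*% C   # matrix product: (3 x 2) times (2 x 2) gives (3 x 2)
C_inverse      <- solve(C)  # inverse of a square, non-singular matrix

sum_AB
product_AC
C_inverse
C %*% C_inverse             # identity matrix, up to rounding error
```

Note that * multiplies matching entries, while %*% performs matrix multiplication and requires the number of columns of the first matrix to equal the number of rows of the second.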
Add or Delete Elements in Vectors and Matrices
# The rbind() and cbind() functions enable one to add rows or columns to a matrix.
one=c(1,1,1,1)
z=matrix(c(1,2,3,4,1,1,0,0,1,0,1,0),ncol=3)
cbind(one,z) # add 'one' as a new first column
## one
## [1,] 1 1 1 1
## [2,] 1 2 1 0
## [3,] 1 3 0 1
## [4,] 1 4 0 0
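z[-c(1,2),] # delete the first and second rows; to delete more than one row, pass a vector of row indices
## [,1] [,2] [,3]
## [1,] 3 0 1
## [2,] 4 0 0
z[,-1] # delete the first column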
## [,1] [,2]
## [1,] 1 1
## [2,] 1 0
## [3,] 0 1
## [4,] 0 0
## [1] 1 0 1 0
Reading Data Sets in R
• Avoid blank spaces when naming fields and values (spaces sometimes indicate
separation) to avoid errors;
• Avoid names that contain symbols such as ?, $, %, ^, &, *, (, ), -, #, <, >, /, |, [, ], {, and };
## ht wt
## 1 58 115
## 2 59 117
## 3 60 120
## 4 61 123
## 5 62 126
## 6 63 129
## 7 64 132
## 8 65 135
## 9 66 139
## 10 67 142
## V1 V2
## 1 58 115
## 2 59 117
## 3 60 120
## 4 61 123
## 5 62 126
## 6 63 129
## 7 64 132
## 8 65 135
## 9 66 139
## 10 67 142
diadata=read.csv("diabetes1.csv",header=F)
head(diadata)
## V1 V2 V3 V4 V5 V6 V7 V8 V9
## 1 6 148 72 35 0 33.6 0.627 50 1
## 2 1 85 66 29 0 26.6 0.351 31 0
## 3 8 183 64 0 0 23.3 0.672 32 1
## 4 1 89 66 23 94 28.1 0.167 21 0
## 5 0 137 40 35 168 43.1 2.288 33 1
## 6 5 116 74 0 0 25.6 0.201 30 0
For delimited files (data organized as a matrix of rows and columns) you can use read.delim(). The
delimiter can be specified with the sep argument, for example:
sep="\t" for tab-delimited files
## drug math ed
## 1 2.17 7.90 2
## 2 2.97 5.20 1
## 3 3.26 6.47 2
## 4 2.69 3.07 3
## 5 3.83 4.15 4
## 6 2.00 2.02 2
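The effect of the header and sep arguments can be checked without any external file. The sketch below writes a small tab-delimited file to a temporary location and reads it back; the file, the object names, and the choice of the first three ht/wt rows shown earlier are for illustration only.

```r
# Write a small tab-delimited file to a temporary location
tmp_file <- tempfile(fileext = ".txt")
demo <- data.frame(ht = c(58, 59, 60), wt = c(115, 117, 120))
write.table(demo, tmp_file, sep = "\t", row.names = FALSE, quote = FALSE)

# read.delim() assumes a header row and a tab separator by default
with_header <- read.delim(tmp_file, sep = "\t")
with_header

# With header = FALSE the columns are named V1, V2, ...
# and the header row is read in as an ordinary data row
no_header <- read.table(tmp_file, sep = "\t", header = FALSE)
no_header

unlink(tmp_file)   # remove the temporary file
```

If the column names appear as the first data row, as in no_header, the file has a header line and should be read with header = TRUE.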
Built-in Functions
To make development easy, many useful functions are built-in.
## [1] 8
## [1] 9
## [1] 0
## [1] 2
which.max(x)# to get the location of the element with the maximum value
## [1] 3
## [1] 0 9
## [1] 38
## [1] 1 4 13 18 27 27 32 38
## [1] 4.75
## [1] 5
var(x)# variance of vector
## [1] 11.07143
## [1] 3.327376
## [1] 0 1 3 5 5 6 9 9
## [1] 9 9 6 5 5 3 1 0
## [1] 2 6 -4 4 -9 5 1
xy
## [1] 5 8
## [1] 9
## [1] 9
## [1] 9
## [1] 1
## [1] 1 9
## [1] 2 6 8 11 17 18 21 27 36
## [1] 4
## [1] 3
## [1] 2.54951
## [1] 1 2 2 3 3 4 6 6 9
## [1] 9 6 6 4 3 3 2 2 1
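The vector functions in this section can be tried on a small example. The vector x below is an assumption: it is inferred from the cumulative sums and differences printed above, since the original definition is not shown.

```r
x <- c(1, 3, 9, 5, 9, 0, 5, 6)   # assumed vector, consistent with the printed cumsum and diff

length(x)       # number of elements: 8
max(x)          # largest value: 9
min(x)          # smallest value: 0
which.max(x)    # position of the first maximum: 3
which.min(x)    # position of the first minimum: 6
range(x)        # minimum and maximum: 0 9
sum(x)          # 38
cumsum(x)       # running total: 1 4 13 18 27 27 32 38
mean(x)         # 4.75
median(x)       # 5
var(x)          # sample variance: about 11.07143
sd(x)           # sample standard deviation: about 3.327376
sort(x)                      # ascending:  0 1 3 5 5 6 9 9
sort(x, decreasing = TRUE)   # descending: 9 9 6 5 5 3 1 0
diff(x)         # successive differences: 2 6 -4 4 -9 5 1
```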
Data Frame
A data frame in R combines features of vectors, matrices, and lists. Like vectors, data
frames must have the same kind of data in each column. Like matrices, data frames have
both rows and columns. Like lists, data frames allow the user to have a combination of
numeric, character, and logical data.
A data frame may be likened to a worksheet in Excel (or some other spreadsheet program)
or a statistics program like SPSS or JMP.
gender<-c("m","m","m","f","f","m","f","f","f","m")
scores <-c(17,19,16,15,23,17,24,29,24,25)
We can obtain individual columns by using the column index in square brackets. We can
also employ the data frame name followed by a $ sign and the column name.
quiz_scores[2] # get column 2 i.e. gender
## gender
## 1 m
## 2 m
## 3 m
## 4 f
## 5 f
## 6 m
## 7 f
## 8 f
## 9 f
## 10 m
## [1] 17 19 16 15 23 17 24 29 24 25
## [1] "integer"
## [1] "data.frame"
xy1=as.matrix(xy) # create a matrix with xy
class(xy1)
quiz_scores$scores>15
## [1] TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE
## [1] 17 19 16 23 17 24 29 24 25
## [1] 15 23 24 29 24
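As a minimal sketch, a data frame can be built from the gender and scores vectors above with data.frame(). The name quiz_demo is hypothetical, and this data frame holds only these two columns, so its column order may differ from the quiz_scores object used in the text.

```r
gender <- c("m", "m", "m", "f", "f", "m", "f", "f", "f", "m")
scores <- c(17, 19, 16, 15, 23, 17, 24, 29, 24, 25)

# Each vector becomes a column; all vectors must have the same length
quiz_demo <- data.frame(gender, scores)

class(quiz_demo)     # "data.frame"
str(quiz_demo)       # compact overview of columns and types
quiz_demo$scores     # access a column by name

# Logical conditions select matching values
high_scores   <- quiz_demo$scores[quiz_demo$scores > 15]
female_scores <- quiz_demo$scores[quiz_demo$gender == "f"]
high_scores          # 17 19 16 23 17 24 29 24 25
female_scores        # 15 23 24 29 24
```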
Data Subsetting
Data subsetting is an important part of data analysis.
data=read.table("data3.txt",header=T) # load data with demographic information
head(data)
## [1] "data.frame"
## [1] 27 4
# SUBSET BY ROWS
# use a sequence to specify a subset
data[1:5,]
# SUBSET BY COLUMNS
# Specify the column to exclude
data[,-2] # exclude column 2
## Qtr1 Qtr4
## 1960 160.1 120.1
## 1961 160.1 116.9
## 1962 169.7 123.3
## 1963 187.3 120.1
## 1964 176.1 123.3
## 1965 185.7 131.3
## 1966 200.1 136.1
## 1967 204.9 140.9
## 1968 227.3 142.5
## 1969 244.9 153.7
## 1970 244.9 142.5
## 1971 301.0 267.3
## 1972 317.0 336.2
## 1973 371.4 355.4
## 1974 449.9 403.4
## 1975 491.5 409.8
## 1976 593.9 483.5
## 1977 584.3 485.1
## 1978 669.2 509.1
## 1979 827.7 542.7
## 1980 840.5 670.8
## 1981 848.5 701.2
## 1982 925.3 683.6
## 1983 917.3 694.8
## 1984 989.4 730.0
## 1985 1087.0 787.6
## 1986 1163.9 782.8
data[data[,1]>=500,] # you can reference a column by its number; Qtr1 is column 1
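The same subsetting patterns can be tried on a small self-contained data frame. In the sketch below, the name sales is hypothetical, the Qtr1 and Qtr4 values are taken from four rows of the table above, and the Qtr2 values are made-up placeholders.

```r
sales <- data.frame(
  Qtr1 = c(160.1, 491.5, 593.9, 827.7),
  Qtr2 = c(100, 300, 400, 500),            # placeholder values, not from the original data
  Qtr4 = c(120.1, 409.8, 483.5, 542.7),
  row.names = c("1960", "1975", "1976", "1979")
)

first_two  <- sales[1:2, ]                 # subset by rows using a sequence
no_qtr2    <- sales[, -2]                  # exclude column 2
big_qtr1   <- sales[sales$Qtr1 >= 500, ]   # rows meeting a condition, column referenced by name
big_qtr1_b <- sales[sales[, 1] >= 500, ]   # same rows, column referenced by number

# subset() combines a row condition with a column selection
both <- subset(sales, Qtr1 >= 500 & Qtr4 < 500, select = c(Qtr1, Qtr4))

big_qtr1
both
```

Leaving the position after the comma empty, as in sales[1:2, ], keeps all columns; leaving the position before the comma empty keeps all rows.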
Control Structures
Controlling the flow of your program based on some ‘logic’ is important in programming.
Examples include:
• if and else: can be used to control the logic flow and act based on meeting a
condition
• for: execute a loop for a fixed number of times
• while: execute a loop while a condition is true
• repeat: execute an infinite loop until a break or stop command is issued
• break: break the execution of a loop
• next: skip an iteration in a loop
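The while, repeat, break and next keywords listed above can be sketched as follows. The variable names and loop limits are hypothetical and chosen only for illustration.

```r
# while: keep looping as long as the condition is TRUE
i <- 1
while (i <= 3) {
  print(i)
  i <- i + 1
}

# repeat: loops forever unless break is called
count <- 0
repeat {
  count <- count + 1
  if (count >= 5) break   # leave the loop once count reaches 5
}
count                     # 5

# next: skip the rest of the current iteration
odd_values <- c()
for (k in 1:10) {
  if (k %% 2 == 0) next   # skip even numbers
  odd_values <- c(odd_values, k)
}
odd_values                # 1 3 5 7 9
```

A repeat loop without a reachable break never terminates, so the exit condition should be checked on every pass.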
if and else
In this structure, you can test a condition and execute an action based on whether it’s true
or false.
The if and else commands can be combined in different ways:
(a) Take action if a single condition is met
Example:
# This example simply checks whether a number is positive
a = 0.9
if (a>0){print('found a positive number')} # prints because a is greater than 0
b = -0.9
if (b>0){print('found a positive number')} # does nothing because b is not greater than 0
(b) Take action if a condition is met, or take another specified action if the condition is not met
if (condition){ # do this action if the condition is true
} else { # do this action otherwise
}
# returns 'yes' when positive and 'no' otherwise
x<- c(0.8, -0.9, -0.6, 0.9, -0.9, 0.8)
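One way to obtain 'yes' or 'no' for every element of x is the vectorised ifelse() function. Whether the original example used ifelse() is an assumption, and the name is_positive is hypothetical.

```r
x <- c(0.8, -0.9, -0.6, 0.9, -0.9, 0.8)

# ifelse(test, yes, no) evaluates the test element by element
is_positive <- ifelse(x > 0, "yes", "no")
is_positive   # "yes" "no" "no" "yes" "no" "yes"

# A plain if/else handles a single value; note that 'else' sits on the same line as '}'
first_value <- x[1]
if (first_value > 0) {
  print("yes")
} else {
  print("no")
}
```

A plain if statement expects a single TRUE or FALSE, which is why ifelse() is the better fit when the condition is a whole vector.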
(c) Take an action if a condition is met and a different action if another condition is met.
Then, if none of the conditions is met, a specified default action is taken.
if (condition){
#do something if condition is true
} else if (condition2) {
#do something if condition2 is true
} else {
#do something if neither condition 1 nor condition 2 is true
}
for loops
Loops can be used to repeat actions, such as iterating over the elements of an object.
# Loop through a sequence
for (i in 10:20) {
print(i)
}
## [1] 10
## [1] 11
## [1] 12
## [1] 13
## [1] 14
## [1] 15
## [1] 16
## [1] 17
## [1] 18
## [1] 19
## [1] 20
## [1] "item1"
## [1] "item2"
## [1] "item3"
## [1] "item4"
## [1] "item5"
# use the length of the vector to determine how many times to loop
for (i in 1:length(x)) {
print(i)
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
for (i in 1:length(v)) {
  if (v[i] > 0) {
    print("found a positive number")
  } else if (v[i] < 0) {
    print("found a negative number")
  } else {
    print("equals zero")
  }
}
# Loop by row and then column using the 'nrow' and 'ncol' functions
m <- matrix(1:12, nrow = 3) # assumed definition of m, consistent with the output printed below
for (i in 1:nrow(m)) {
for (j in 1:ncol(m)) {
print(m[i, j]) # print the items
}
}
## [1] 1
## [1] 4
## [1] 7
## [1] 10
## [1] 2
## [1] 5
## [1] 8
## [1] 11
## [1] 3
## [1] 6
## [1] 9
## [1] 12
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 11
## [1] 12
## [1] 13
## [1] 14
## [1] 15
## [1] 16
## [1] 17
## [1] 18
## [1] 19
## [1] 20
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
## [1] 11
Functions
The function keyword is used to create new functions.
# this function computes the square of numbers
square<-function(x){
x^2
}
square(2)
## [1] 4
square(-2)
## [1] 4
squares.sum(2,3)
## [1] 13
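The definition of squares.sum is not shown above. A minimal sketch that is consistent with the printed result (2^2 + 3^2 = 13) could look like this; the parameter names a and b are assumptions.

```r
# Assumed definition: returns the sum of the squares of two numbers
squares.sum <- function(a, b) {
  a^2 + b^2
}

result <- squares.sum(2, 3)
result   # 13
```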
x<-c(5,6,3,9,6)
rescale(x)
Besides the built-in functions, user-defined functions and control structures can be used when
defining a new function.
# Define a function that returns the square of a number, 'square'
square<-function(x){
x^2
}
equation (2)
## [1] 15
return in functions
For functions that produce an output, how do you decide what to return?
A common practice is to return a value by just making it the last expression in a function.
Alternatively, you can explicitly return a value from a function before its last expression by
using the ‘return’ function. The ‘return’ function is usually used to exit a function early.
Examples:
x<-c(5,6,3,9,6)
rescale(x)
rescale(x)
rescale(x)
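The definitions of rescale used in these examples are not shown. As a hedged illustration of both return styles, the hypothetical function rescale_demo below maps values onto the range 0 to 1 (min-max scaling, which may differ from the original rescale) and uses return() only for early exits.

```r
# Hypothetical helper: min-max scaling to the range [0, 1]
rescale_demo <- function(x) {
  if (!is.numeric(x)) {
    return(NA)                   # early exit: input cannot be rescaled
  }
  rng <- range(x)
  if (rng[1] == rng[2]) {
    return(rep(0, length(x)))    # early exit: all values equal, avoid dividing by zero
  }
  (x - rng[1]) / (rng[2] - rng[1])   # last expression is returned implicitly
}

x <- c(5, 6, 3, 9, 6)
scaled <- rescale_demo(x)
scaled                 # 0.3333333 0.5 0 1 0.5
rescale_demo("text")   # NA
```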
# TESTING -------------------------
# Values to be converted
temp<-c(15,26,35,37,26,4,-2)
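Named and default parameters can be illustrated with a conversion function applied to the temp values above. The function convert_temp, its from parameter, and the assumption that the values are temperatures in Celsius are all hypothetical; the conversion used originally is not shown.

```r
# Hypothetical function: 'from' has a default value, so it can be omitted in a call
convert_temp <- function(temp, from = "C") {
  if (from == "C") {
    temp * 9 / 5 + 32        # Celsius to Fahrenheit
  } else if (from == "F") {
    (temp - 32) * 5 / 9      # Fahrenheit to Celsius
  } else {
    stop("from must be 'C' or 'F'")
  }
}

temp <- c(15, 26, 35, 37, 26, 4, -2)

fahrenheit <- convert_temp(temp)              # uses the default from = "C"
fahrenheit                                    # 59.0 78.8 95.0 98.6 78.8 39.2 28.4

# Named arguments can be given in any order
celsius <- convert_temp(from = "F", temp = 59)
celsius                                       # 15
```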
The plot() function
plot(x, y)
Here is a more concrete example where we plot a sine function over the range -pi to pi.
# In this example, plot() receives a vector of values as x and calculates the y values using the sin() function.
# Overlaying plots in R Using legend() function
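A minimal sketch of plotting a sine curve from -pi to pi and overlaying a second curve with a legend is given below. The number of points, colours, line types, and the choice of cos(x) as the second curve are assumptions made for illustration.

```r
# 100 evenly spaced x values between -pi and pi
x <- seq(-pi, pi, length.out = 100)

# Draw the first curve as a line
plot(x, sin(x), type = "l", col = "blue",
     xlab = "x", ylab = "y", main = "sin(x) and cos(x)")

# lines() adds to the existing plot instead of starting a new one
lines(x, cos(x), col = "red", lty = 2)

# legend() labels each curve; col and lty must match the curves drawn above
legend("topleft", legend = c("sin(x)", "cos(x)"),
       col = c("blue", "red"), lty = c(1, 2))
```

Calling plot() a second time would replace the figure, which is why the second curve is added with lines().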
Getting Help
It is important to learn how to get help on commands and packages. The extensive help and
documentation of R and R packages can be accessed using commands on the console or
through the RStudio GUI.
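A few console commands for getting help are sketched below, using mean and sd as example topics; any other function name can be substituted.

```r
help("mean")      # open the help page for a function
?sd               # shorthand for help("sd")
example(mean)     # run the examples from the help page

# apropos() lists objects whose names match a pattern
matches <- apropos("^mean$")
matches           # includes "mean"
```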