BasicRWorkshop Jan 2016
January 2016 2
Topics Covered
1. The R Environment
2. The R Language
a) Expressions
b) Workspace objects
c) Function call
3. Data Types and Structures
4. Data Import/Export and Manipulation
5. Plots in R
RGui: R Console
Command line
RGui: R Editor
• The editor allows you to
type and edit your R code.
• There are 3 ways to run
the code from the editor:
1. Press Ctrl+R to run the
line of code that your text
cursor is on.
2. Highlight the line(s) of code
that you want to run and
press Ctrl+R.
3. Highlight the line(s) of
code that you want to run,
click the right mouse
button, and choose “Run
line or selection”.
RGui: R Editor
1. Start a new script (Ctrl-N).
#Filename: MyFirstR.R
#Purpose: Basic R Workshop Exercises
RGui: R Editor
Exercise:
Type the following lines in your R editor and run the lines.
Observe what happens in the R console.
#This is a comment
a = 1+2+3
a
setwd("C:/Users/gloriateng/Dropbox/RWorkshop")
getwd()
*Note that the forward slash (/) is used instead of the backslash (\).
2. Point-and-Click
a. Click on any empty spot of your R console.
b. Go to “File” – “Change dir”.
c. Choose the directory path where your working
directory is.
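A minimal sketch of changing the working directory from code. It uses a temporary directory in place of a real workshop folder so that it runs on any machine; the names old_dir, new_dir, and win_path are illustrative only.

```r
# Remember where we started so the change can be undone
old_dir = getwd()

# tempdir() stands in for your own workshop folder (an assumption for this sketch)
new_dir = tempdir()
setwd(new_dir)
getwd()

# If you do type a Windows path with backslashes, each one must be doubled,
# because a single backslash starts an escape sequence inside an R string
win_path = "C:\\Users\\me\\RWorkshop"   # hypothetical path, not used below
win_path

# Restore the original working directory
setwd(old_dir)
```

Saving the old directory first makes it easy to return to it once you are done.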
Recap
• RGui: software that allows you to edit and run R
code.
• R console: allows you to type and run code from the
command line.
• R editor: allows you to type, edit, and run code. Code
written in the R editor can be saved as a .R file.
Arithmetic Keywords
Functions Variables
Expressions
Type the expressions given (without the comments) in your editor and run your
code.
a
b
Expressions: Conditions
== Equality
!= Inequality
|| Logical or
! Logical not
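The operators in the table above can be tried out as follows. This is a small sketch: a takes the value from the earlier exercise, while the value of b is a hypothetical one chosen for illustration.

```r
a = 1 + 2 + 3   # 6, as in the earlier exercise
b = 3           # hypothetical value, for illustration only

a == b              # equality: FALSE
a != b              # inequality: TRUE
a == 6 || b == 0    # logical or: TRUE, because the first condition holds
!(a == b)           # logical not: TRUE
```

Each condition evaluates to a logical value, TRUE or FALSE.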
Expressions: Keywords
These symbols normally appear when R is unable to
evaluate an expression, for example, or when a
dataset has missing values.
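The list of symbols is not shown here. Assuming it refers to R's special values such as NA, NaN, and Inf, a short sketch of where they typically turn up:

```r
0 / 0                    # NaN: the result is not a number
1 / 0                    # Inf: positive infinity

x = c(1, NA, 3)          # NA marks a missing value
is.na(x)                 # FALSE TRUE FALSE
mean(x)                  # NA, because one value is missing
mean(x, na.rm = TRUE)    # 2, after dropping the missing value
```

Many functions accept na.rm = TRUE to ignore missing values; check the documentation of each function to be sure.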
Workspace Objects
• When a value is assigned to a symbol/variable, it is
remembered as an object in R’s memory.
• Recall that you currently have two variables, namely “a”
and “b” in memory.
• To find out which objects are currently in your
workspace, type ls().
• To clear all the objects in your workspace, type
rm(list = ls(all=TRUE)).
• Now type ls() again. You have now removed all the
objects in your workspace.
• To clear the console, press Ctrl-L.
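The steps above can be sketched as follows. The value of b is a hypothetical one; note that the last rm() call removes every object in the workspace, so run it only when you mean to.

```r
a = 1 + 2 + 3
b = 10                       # hypothetical value, for illustration only

ls()                         # lists the objects in the workspace, including "a" and "b"

rm(b)                        # remove a single object
ls()                         # "b" is gone

rm(list = ls(all = TRUE))    # remove everything
ls()                         # character(0): the workspace is empty
```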
Getting Help
• help(functionname)
• An example: help(sd)
• And don’t forget, Google is always one’s best friend!
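A few related ways to look up a function, sketched here with sd as in the example above. All of these are standard R functions; how the help page is displayed depends on your R setup.

```r
help(sd)       # open the documentation page for sd
args(sd)       # show the arguments of sd and their defaults
example(sd)    # run the examples from the documentation page
```

In the console, ?sd is a shorthand for help(sd).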
Recap
• Expressions: constant values, arithmetic,
conditions, assigning values to
symbols/variables, keywords, functions
• Use ls() to list the variables in your current
workspace.
• Use rm(list = ls(all=TRUE)) to remove
all variables in your current workspace.
• Ctrl-L clears the console.
• Learning to read the R documentation properly
helps you to understand a function better.
• If the code is not running, stay calm and check your
code. Most of the time the cause is a spelling mistake
(especially upper/lower case), a forgotten parenthesis, or
a wrongly used function argument.
• Data structures:
• Vectors
• Factors
• Matrices
• Data frames
• Lists
A numeric vector: 1, 2, 3.
A character vector: "A", "B", "C".
counts = c(2, 4, 6, 2, 8)
counts
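A short sketch of working with vectors. The counts vector is the one above; the fruits values are taken from the data frame printed later in the workshop.

```r
counts = c(2, 4, 6, 2, 8)                                  # a numeric vector
fruits = c("Apple", "Orange", "Grape", "Apple", "Apple")   # a character vector

length(counts)        # 5
counts[2]             # 4: the second element
counts[counts > 2]    # 4 6 8: elements that satisfy a condition
counts * 2            # 4 8 12 4 16: arithmetic works element by element
class(fruits)         # "character"
```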
A factor: Gender = {M, F} can be coded as Gender = {0, 1}, with M = 0 and F = 1.
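One possible way to build such a factor is sketched below. The gender values are those shown in the workshop's data frame; the factor() call itself is an assumption about how genderF was constructed.

```r
gender = c("M", "F", "M", "F", "M")

# Assumed construction: relabel the levels M and F as 0 and 1
genderF = factor(gender, levels = c("M", "F"), labels = c(0, 1))

genderF
levels(genderF)    # "0" "1"
table(genderF)     # number of observations in each level
```

Note that the labels of a factor are stored as text, so "0" and "1" here are category names rather than numbers.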
1. Create a data frame for the variables counts, fruits, gender, and
genderF.
2. Observe what happens when you run the following commands given:
• myframe
• dimnames(myframe)
• dim(myframe)
• nrow(myframe)
• ncol(myframe)
• names(myframe)
• myframe[1,1]
• myframe[1,]
• myframe[1]
> myframe
counts fruits gender genderF
1 2 Apple M 0
2 4 Orange F 1
3 6 Grape M 0
4 2 Apple F 1
5 8 Apple M 0
> dimnames(myframe)
[[1]]    # Component index
[1] "1" "2" "3" "4" "5"
[[2]]
[1] "counts" "fruits" "gender" "genderF"
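The output above could be reproduced with a sketch like the following. The values are read off the printed data frame; the factor() call for genderF is an assumption.

```r
counts = c(2, 4, 6, 2, 8)
fruits = c("Apple", "Orange", "Grape", "Apple", "Apple")
gender = c("M", "F", "M", "F", "M")
genderF = factor(gender, levels = c("M", "F"), labels = c(0, 1))   # assumed construction

myframe = data.frame(counts, fruits, gender, genderF)

dim(myframe)       # 5 4: five rows, four columns
names(myframe)     # the column names
myframe[1, 1]      # 2: first row, first column
myframe[1, ]       # the first row, as a one-row data frame
myframe[1]         # the first column, as a one-column data frame
```

The comma matters: myframe[1, ] selects a row, while myframe[1] selects a column.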
mylist = list(rownames=rownames(myframe),
colnames=colnames(myframe))
mylist
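A short sketch of how the components of such a list can be accessed. A small stand-in for myframe is built first so that the example runs on its own.

```r
# Stand-in data frame, for illustration only
myframe = data.frame(counts = c(2, 4, 6),
                     fruits = c("Apple", "Orange", "Grape"))

mylist = list(rownames = rownames(myframe),
              colnames = colnames(myframe))

names(mylist)       # "rownames" "colnames"
mylist$colnames     # access a component by name
mylist[[2]]         # access the same component by position
mylist[[2]][1]      # "counts": first element of the second component
length(mylist)      # 2
```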
1 5 9
2 6 10
3 7 11
4 8 12
A 4 x 3 matrix.
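The matrix shown above can be created as follows. This sketch uses the name m for illustration; note that R fills a matrix column by column unless told otherwise.

```r
m = matrix(1:12, nrow = 4)    # filled column by column
m

dim(m)       # 4 3
m[2, 3]      # 10: row 2, column 3
m[, 1]       # 1 2 3 4: the first column

# The same numbers filled row by row instead
matrix(1:12, nrow = 4, byrow = TRUE)
```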
matrix(1:6, ncol=3)
matrix(1:6, nrow = 2, ncol=3)
Observe the results from these two lines of code. Are they the same or
different?
array(1:8, dim=c(2,2,2))
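One way to check the comparison in code, and to look inside the array, is sketched below. The names m1, m2, and arr are illustrative.

```r
m1 = matrix(1:6, ncol = 3)
m2 = matrix(1:6, nrow = 2, ncol = 3)
identical(m1, m2)    # TRUE: R works out the number of rows from the data

arr = array(1:8, dim = c(2, 2, 2))
dim(arr)             # 2 2 2
arr[, , 1]           # the first 2 x 2 slice
arr[1, 2, 2]         # 7: row 1, column 2 of the second slice
```

An array generalises a matrix to more than two dimensions, and is indexed with one position per dimension.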
Recap
•Vectors
•Factors
•Data frames
•Lists
• Data manipulation
• Extracting rows/columns from datasets
• The “apply” function
Data Manipulation
Exercise:
1. Extract the column “speed”.
attach(mycar)
speed #Observe what happens here.
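A sketch of extracting a column without attaching the data frame. It assumes that mycar has the same columns as R's built-in cars dataset (speed and dist), which is used here as a stand-in.

```r
mycar = cars    # assumption: the built-in cars dataset stands in for mycar

mycar$speed          # by name, using the $ sign
mycar[["speed"]]     # by name, using double brackets
mycar[, "speed"]     # by name, using matrix-style indexing
mycar[, 1]           # by position

with(mycar, mean(speed))   # use column names directly for one expression only
```

Unlike attach(), these forms do not add anything to the search path, so there is no risk of confusing a column with another object of the same name.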
Data Manipulation
Exercise:
2. Find the sum, mean, and variance for the variables speed and
dist.
apply(mycar, 2, sum)
apply(mycar, 2, mean)
apply(mycar, 2, var)
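A sketch of what the second argument of apply means, again assuming the built-in cars dataset as a stand-in for mycar. The names col_sums, col_means, and col_vars are illustrative.

```r
mycar = cars    # assumption: the built-in cars dataset stands in for mycar

# Margin 2 applies the function to each column
col_sums  = apply(mycar, 2, sum)
col_means = apply(mycar, 2, mean)
col_vars  = apply(mycar, 2, var)
col_means

# Margin 1 applies the function to each row instead
apply(mycar, 1, sum)[1:3]

# Equivalent shortcuts for column summaries
colSums(mycar)
colMeans(mycar)
sapply(mycar, var)
```

The shortcuts give the same numbers; apply is the more general tool because it accepts any function.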
Recap
• read.table(), write.table()
• Data manipulation
• Extracting rows/columns from datasets: using
the $ sign or attaching the object
• The “apply” function
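A minimal round trip with write.table() and read.table(), sketched with the built-in cars dataset as a stand-in for mycar and a temporary file in place of a real file name. The names out_file and mycar2 are illustrative.

```r
mycar = cars    # assumption: the built-in cars dataset stands in for mycar

# Write the data to a tab-separated text file
out_file = tempfile(fileext = ".txt")
write.table(mycar, file = out_file, sep = "\t",
            row.names = FALSE, quote = FALSE)

# Read it back; header = TRUE takes the column names from the first line
mycar2 = read.table(out_file, header = TRUE, sep = "\t")
head(mycar2)

# Check that the values survived the round trip
all(mycar2 == mycar)

unlink(out_file)    # delete the temporary file
```

Using the same sep value for writing and reading avoids columns being split in the wrong place.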
Session 5: Plots in R
Exercise 1:
1. Read the documentation for the function plot (run help(plot)).
2. Type and run the following lines. Observe the difference in each line and what
happens when you run the code.
plot(mycar)
plot(dist, speed)
Session 5: Plots in R
Exercise 1 (continued):
abline(h = 10)
abline(v=50)
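The plotting commands of Exercise 1 can be run without attaching the data frame, as in this sketch. It assumes the built-in cars dataset as a stand-in for mycar; the dashed mean line is an extra illustration.

```r
mycar = cars    # assumption: the built-in cars dataset stands in for mycar

plot(mycar)                       # a data frame with two columns gives a scatter plot

with(mycar, plot(dist, speed))    # dist on the x-axis, speed on the y-axis
abline(h = 10)                    # horizontal line at speed = 10
abline(v = 50)                    # vertical line at dist = 50
abline(h = mean(mycar$speed), lty = 2)   # dashed line at the mean speed
```

abline() draws on the plot that is currently open, so it must come after a call to plot().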
Session 5: Plots in R
Exercise 2 (extra stuff in video, please watch it!)
1. Generate 100 random values from the standard normal distribution.
2. Plot a histogram with the estimated density on the left, and the empirical
cumulative distribution function on the right.
n = 100 #sample size
x = rnorm(n) #generate n random values from the standard normal distribution
par(mfrow = c(1,2)) #divide the graphic window into 1 row and 2 columns
#draw a histogram (if probability = TRUE, the density is drawn; otherwise
#the count of observations is given)
hist(x, xlab = "x", ylab = "Density", main = "",
col = "grey", border = "white", probability = TRUE)
#add a legend
legend("topleft", c("Histogram", "Normal density"), cex=0.8,
col = c("grey", "black"), lwd = c(NA,1),
lty = c(NA,2), pch = c(15, NA), bty = "n")
Session 5: Plots in R
Exercise 2 (continued):
#the function locator(1) allows you to choose the location of the legend
#using a mouse click
legend(locator(1), c("Empirical c.d.f", "Normal c.d.f."), cex=0.8,
col = c("grey", "black"), lwd = c(2,1),
lty = c(1,2), bty = "n")
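Putting Exercise 2 together, one possible complete version is sketched below. The calls that draw the curves (curve() and plot(ecdf())) are assumptions, since only the hist() and legend() calls are shown above; the seed and the fixed legend positions are also choices made for this sketch, so that it runs without a mouse click.

```r
set.seed(1)    # hypothetical seed, so the picture is reproducible
n = 100
x = rnorm(n)

par(mfrow = c(1, 2))    # 1 row, 2 columns

# Left: histogram on the density scale, with the standard normal density
hist(x, xlab = "x", ylab = "Density", main = "",
     col = "grey", border = "white", probability = TRUE)
curve(dnorm(x), add = TRUE, lty = 2)
legend("topleft", c("Histogram", "Normal density"), cex = 0.8,
       col = c("grey", "black"), lwd = c(NA, 1),
       lty = c(NA, 2), pch = c(15, NA), bty = "n")

# Right: empirical c.d.f., with the standard normal c.d.f.
plot(ecdf(x), main = "", xlab = "x", ylab = "Cumulative probability",
     col = "grey", lwd = 2)
curve(pnorm(x), add = TRUE, lty = 2)
legend("topleft", c("Empirical c.d.f.", "Normal c.d.f."), cex = 0.8,
       col = c("grey", "black"), lwd = c(2, 1),
       lty = c(1, 2), bty = "n")

par(mfrow = c(1, 1))    # reset the layout
```

Replacing "topleft" with locator(1) in the second legend lets you place it with a mouse click, as described above.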
•Saving images
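One common way to save a plot from code is to open a file device, draw the plot, and close the device. This is a sketch under assumptions: the method shown in the workshop may differ, the built-in cars dataset stands in for mycar, and the file name is hypothetical.

```r
mycar = cars    # assumption: the built-in cars dataset stands in for mycar

# Hypothetical file name in a temporary folder; use your own path instead
out_file = file.path(tempdir(), "mycar_plot.pdf")

pdf(out_file, width = 6, height = 4)    # open a PDF file as the graphics device
plot(mycar)                             # the plot is drawn into the file
dev.off()                               # close the device to finish writing

file.exists(out_file)
```

The functions png() and jpeg() work in the same way for image files, where your R installation supports them.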
Moving on
1. Statistics in R
2. Linear regression and time series (Applied Statistical
Models)
3. for, while loops, creating your own functions
4. S4 objects
5. Read the documentation
6. Google with discernment
7. Practice, practice, practice, practice!
References
1. An interactive online learning tool: http://tryr.codeschool.com/
3. R Tutorial: http://www.r-tutor.com/
4. R-bloggers: http://www.r-bloggers.com/
5. Quick-R: http://www.statmethods.net/