Tutorial 1

Uploaded by

Jessica Kristy
Introduction to R
What is R?
• R is a powerful language and environment for statistical computing
and graphics. It is widely used as a teaching language and research
tool.
• Free
• A lot of help is available online
• Can be understood (hopefully) by people without any programming
experience
• Open source (thousands of packages online, and still growing)
• We prefer to use R in combination with the RStudio interface, which
has an organized layout and several extra options.
Install R and RStudio on your PC

1. Install R

2. Install RStudio

Or use Posit Cloud (formerly RStudio Cloud) in your browser

https://posit.cloud/

Posit Cloud lets you access RStudio right in your browser, with no
installation or complex configuration required.
1. Sign up with your email address
2. Verify your email
3. Create a new RStudio project
RStudio Layout

• Top left: editor window (also called script window)
• Bottom left: console window
• Top right: workspace / history window
• Bottom right: files / plots / packages / help window
Install and Load Packages in R
• install.packages("XXXX") # install a new package; replace XXXX with the
package name
• library(XXXX) # load an installed package; replace XXXX with the package
name
Note: Text in red is code; grey text is a placeholder to replace. Text after
the pound sign "#" on the same line is treated as a comment.
Set the working directory
• setwd("c:/User/your location") # or setwd("~/your location") on a Mac
• setwd(choose.dir()) # choose the directory manually (choose.dir() is available on Windows only)
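A minimal sketch of checking and changing the working directory. Here tempdir() is only a stand-in for a real project folder (an assumption for illustration), and old_wd is a hypothetical variable name.

```r
# Remember the current working directory so it can be restored later
old_wd <- getwd()

# tempdir() is just a placeholder folder; use your own project folder instead
setwd(tempdir())
print(getwd())   # shows the new working directory

# Go back to where we started
setwd(old_wd)
print(getwd())
```

Calling getwd() before and after setwd() is a quick way to confirm that R is reading and writing files where you expect.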
Basic Expressions: Operators
• +, -, *, /, ^, sqrt
• 1+1
• 2-1
• 3*2
• 6/3
• 3^3
• sqrt(9)
• sqrt((1+7)*8/5-9)
• "hello world" # strings need quotes
• 3 < 4 # some expressions return a logical value: TRUE or FALSE
Operators in R

Arithmetic Operators
Operator    Description        Example
= (or <-)   Assign a value     a = 1+2
+           Add                x+y
-           Subtract           x-y
*           Multiply           x*y
/           Divide             x/y
** or ^     Exponentiation     x^y or x**y
%%          Modulus            x%%y
%/%         Integer division   x%/%y

Logical Operators
Operator    Description                      Example
<, >        Less, greater than               x<y
<=, >=      Less, greater than or equal to   x>=y
==          Equal to                         x==y
!=          Not equal to                     x!=y
!           Not                              !x
|           Or                               x|y
&           And                              x&y
isTRUE()    Test if true                     isTRUE(x==y)
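To make the less familiar operators concrete, here is a short sketch with two example values. The values 7 and 2 and the variable names are chosen only for illustration.

```r
x <- 7
y <- 2

remainder <- x %% y     # modulus: what is left after dividing 7 by 2 -> 1
quotient  <- x %/% y    # integer division: how many whole times 2 fits in 7 -> 3
power     <- x ^ y      # exponentiation, same as x ** y -> 49

x == y                        # FALSE
x != y                        # TRUE
both   <- (x > 5) & (y > 5)   # FALSE: only the first comparison is TRUE
either <- (x > 5) | (y > 5)   # TRUE: at least one comparison is TRUE
!(x > y)                      # FALSE
isTRUE(x == y)                # FALSE
```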
Variables
• x <- 85 # <- assigns a value
• x/2 # use x in expressions
• x <- x*2 + 48 # if you assign to x again, it forgets the value it had before
• x <- "hello world" # assign a string value to x
• Height <- 180
• Weight <- 50
• Height*Weight
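The reassignment rule can be traced step by step. In this sketch, half and doubled are hypothetical helper names used only to keep the intermediate results.

```r
x <- 85
half <- x / 2        # 42.5; using x in an expression does not change x

x <- x * 2 + 48      # the right-hand side uses the old x: 85 * 2 + 48 = 218
doubled <- x

x <- "hello world"   # x now holds a string; the number 218 is gone from x
class(x)             # "character"
```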
Functions
• You call a function by typing its name, followed by one or more
arguments to that function in parentheses
• sum(1,2,4,7) # summation
• rep("penny", times=3) # repeat "penny" 3 times
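As a small sketch of how arguments work, the calls below store their results in hypothetical variables. The each argument of rep() is not covered in the slides and is shown only to illustrate that a named argument changes what a function does.

```r
total <- sum(1, 2, 4, 7)                    # 14
pennies <- rep("penny", times = 3)          # "penny" "penny" "penny"

# 'each' repeats every element before moving on to the next one
each_twice <- rep(c("a", "b"), each = 2)    # "a" "a" "b" "b"

print(total)
print(pennies)
print(each_twice)
```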
Useful Tips
• R is case sensitive
• help(rep) # or ?rep
• Always use # to add comments to your code
• Quick-start tutorials for R: Quick-R; R Tutor
• Stack Exchange; R-bloggers
Data Types and Structures
Data Types
• Three general types of data
• Strings
• "Why, hi there"
• Numbers
• TRUE/FALSE
• Missing data (NA). Note: NA is written without quotes; "NA" in quotes is a string
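One way to check which type a value has is class(). The sketch below also shows the difference between the missing value NA and the string "NA"; the variable name types is hypothetical.

```r
# class() reports the type of each value
types <- c(class("Why, hi there"), class(42), class(TRUE))
print(types)     # "character" "numeric" "logical"

is.na(NA)        # TRUE: a real missing value
is.na("NA")      # FALSE: this is just a two-letter string
```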
Data Structures
• Vectors
• Matrices
• Data frames
• Lists

Source: Kabacoff (2011)


Vectors
• A vector is a list of values (numeric, logical, or string)
• A vector's values can be numbers, characters, or logical values, as long
as they are all the same type
• Define a vector
  • x <- c() # empty vector
  • x <- c(1,2,4,7) # the c function creates a vector by combining a list of values
  • y <- c("a", "b", "c", "d") # vector of characters
  • y <- c(1,2,"hi") # what is the data type of this vector?
  • typeof(y)
  • V <- seq(5,10,0.5)
  • V <- c(1:10)
• Vector subsetting
  • y[2] # access the second value of y
  • y[1:3] # access the first to the third value of y
  • y[c(1,3)] # access the first and the third value of y
  • y[3] <- "cat" # assign a new value to the third value in y
  • y[5:6] <- c("likes", "me") # assign multiple values to consecutive positions
• Vector math
  • V {+,-,*,/} 1 # i.e. V+1, V-1, V*1 or V/1
  • V+V
  • V*V
  • sqrt(V)
  • sum(V)
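The following sketch answers the "what is the data type" question and shows what happens when you assign past the end of a vector. The names y_type and squares_total are hypothetical.

```r
y <- c(1, 2, "hi")
y_type <- typeof(y)           # "character": the numbers were converted to strings

y[3] <- "cat"
y[5:6] <- c("likes", "me")    # position 4 was never set, so R fills it with NA
print(y)
length(y)                     # 6

V <- seq(5, 10, 0.5)          # 5.0, 5.5, ..., 10.0 (11 values)
V + 1                         # adds 1 to every element
V * V                         # multiplies element by element

squares_total <- sum(c(1:10) * c(1:10))   # 1 + 4 + 9 + ... + 100 = 385
```

Vector math is applied element by element, which is why V * V gives the squares of the values rather than a single number.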
Matrices
• A matrix is a collection of data elements arranged in a two-dimensional
rectangular layout; all elements should be of the same type.
• Define a matrix
  • m <- matrix()
  • matrix(1,5,5) # create a 5*5 matrix with all values equal to 1
  • m <- matrix(1:12, nrow=3, ncol=4) # create a 3*4 matrix with values 1:12
  • V <- c(1:12)
  • m <- matrix(V,3,4) # note the capital V: R is case sensitive
• Matrix subsetting
  • m[2,3] # get the value in row 2, column 3
  • m[2,3] <- 0 # assign a new value to a cell in the matrix
  • m[1,]
  • m[,2]
  • m[,2:3]
  • m[,c(1,3)]
  • m[c(1,2),c(1,3)]
• Matrix math
  • m {+,-,*,/} 1 # i.e. m+1, m-1, m*1 or m/1
  • m+m
  • cbind(m,m)
  • rbind(m,m)
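A short sketch of how the 3*4 matrix is filled and what the subsetting and binding calls return. The names wide and tall are hypothetical.

```r
m <- matrix(1:12, nrow = 3, ncol = 4)   # values are filled column by column
print(m)

m[2, 3]          # row 2, column 3 -> 8
m[1, ]           # whole first row -> 1 4 7 10
m[, 2]           # whole second column -> 4 5 6

wide <- cbind(m, m)   # glue side by side: 3 rows, 8 columns
tall <- rbind(m, m)   # stack on top of each other: 6 rows, 4 columns
dim(wide)
dim(tall)

m + m            # element-by-element addition
```

Because matrix() fills column by column by default, the value 8 lands in row 2 of column 3, not in row 3 of column 2.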
Data Frames
• A data frame is a data set that can include multiple types of data,
such as numeric and string
• Define a data frame
  • Weights <- c(1:8)
  • Prices <- c(2:9)
  • Types <- c(T,F,F,F,T,F,T,F)
  • Data <- data.frame(Weights, Prices, Types)
• Data frame subsetting
  • Data[1,2]
  • Data$Prices
• Data frame math
  • Data$Weights*Data$Prices
  • Data$new <- Data$Weights*Data$Prices # save the new score as a new variable in the data frame
  • mean(Data$Weights)
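Putting the data frame steps together as one runnable sketch. str() and the logical row filter at the end are additions for illustration, and true_rows is a hypothetical name.

```r
Weights <- c(1:8)
Prices <- c(2:9)
Types <- c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE)
Data <- data.frame(Weights, Prices, Types)

str(Data)            # shows each column with its type

Data[1, 2]           # row 1, column 2 (Prices) -> 2
Data$new <- Data$Weights * Data$Prices   # new column, computed row by row
mean(Data$Weights)   # 4.5

# Keep only the rows where Types is TRUE (rows 1, 5 and 7)
true_rows <- Data[Data$Types, ]
print(true_rows)
```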
Lists
• In R, lists act as containers.
• Unlike atomic vectors, the contents of a list are not restricted to a single
mode and can encompass any mixture of data types.
• Define a list
• v1 <- c(1,6,7,8)
• v2 <- c(2,4)
• m <- matrix(1,2,4)
• L <- list (v1, v2, m)
• List Access
• L[[1]]
• Lists are extremely useful inside functions. You can "staple" together lots of
different kinds of results into a single object that a function can return.
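The "stapling" idea might look like the sketch below. The function summarise_vector and its element names are hypothetical, made up for this illustration.

```r
v1 <- c(1, 6, 7, 8)
v2 <- c(2, 4)
m <- matrix(1, 2, 4)
L <- list(v1, v2, m)

L[[1]]    # double brackets return the element itself (the vector v1)
L[1]      # single brackets return a shorter list that contains v1

# A hypothetical function that returns several results in one list
summarise_vector <- function(v) {
  list(values = v, total = sum(v), average = mean(v))
}

res <- summarise_vector(v1)
res$total      # 22
res$average    # 5.5
```

Naming the list elements lets you pull out each result with $ instead of remembering its position.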
Data Management
Manual Data Input
• age <- c(20,25,30)
• gender <- c("male","female","male")
• score <- c(65,75,85)
• newdata <- data.frame(age,gender,score) # create new data frame by function
data.frame
• newdata$midterm <- c(7,8,9) # add a new variable named midterm to the data
frame
• newdata$sum <- newdata$score + newdata$midterm #create a new variable
based on existing variables in the data frame
Import and Export Data
Data Type       Import                                   Export
TXT, CSV        read.table(file="")                      write.table()
                read.csv(file="")                        write.csv()
SAV, DTA        library(foreign)
                read.spss(file="", to.data.frame = T)
                read.dta(file="")
x.Rdata files   load("data.Rdata")                       save(data, file="data.Rdata")
(save & load)
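A round trip through a CSV file and an .Rdata file can be sketched as follows. The temporary file paths from tempfile() are placeholders for real file names, and from_csv is a hypothetical variable name.

```r
newdata <- data.frame(
  age = c(20, 25, 30),
  gender = c("male", "female", "male"),
  score = c(65, 75, 85)
)

# CSV: write the data frame out, then read it back in
csv_path <- tempfile(fileext = ".csv")    # placeholder path
write.csv(newdata, file = csv_path, row.names = FALSE)
from_csv <- read.csv(file = csv_path)
print(from_csv)

# .Rdata: save the object, remove it, then load it back under the same name
rdata_path <- tempfile(fileext = ".Rdata")
save(newdata, file = rdata_path)
rm(newdata)
load(rdata_path)
print(newdata)
```

Setting row.names = FALSE keeps write.csv() from adding an extra column of row numbers to the file.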
Viewing Data
• View(mtcars)
• names(mtcars) # list the variable names in mtcars
• mtcars$mpg # access a specific variable of mtcars
Rename a column in a data frame
• names(mtcars) #check the existing column names
• names(mtcars)[1] <- "brand" # rename by index in the names vector
Recode Data
• Recode disp into a new variable, rank
• mtcars$rank[mtcars$disp <= 160] <- "L"
• mtcars$rank[mtcars$disp > 160 & mtcars$disp <= 300] <- "M"
• mtcars$rank[mtcars$disp > 300] <- "H"
• mtcars$rank
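As an alternative to the three assignments, the same recoding could be sketched with cut(), which is not used in the slides. The copy named cars and the column rank_cut are hypothetical names, and the code works on a copy so that mtcars itself is left unchanged.

```r
cars <- mtcars   # work on a copy of the built-in data set

# The step-by-step recoding from the slide, starting from an all-NA column
cars$rank <- NA_character_
cars$rank[cars$disp <= 160] <- "L"
cars$rank[cars$disp > 160 & cars$disp <= 300] <- "M"
cars$rank[cars$disp > 300] <- "H"

# The same rule in one call: intervals are (-Inf,160], (160,300], (300,Inf)
cars$rank_cut <- cut(cars$disp,
                     breaks = c(-Inf, 160, 300, Inf),
                     labels = c("L", "M", "H"))

# Cross-tabulate the two columns to confirm they agree
table(cars$rank, cars$rank_cut)
```

Note that cut() returns a factor rather than a character vector, which is often convenient for grouped summaries and plots.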
Subsetting Datasets
• Selecting/keeping variables
• newdata1 <- mtcars[ ,c(1, 3)] # keep the first and third columns
• newdata1 <- mtcars[ ,c("brand", "rank")] # keep the columns named brand and rank
• Dropping variables
• newdata2 <- mtcars[ ,c(-2 : -5)] # drop the second to the fifth column in the data frame
• Selecting observations
• newdata3 <- mtcars[which(mtcars$rank == "H"),]
• newdata4 <- mtcars[which(mtcars$wt > 4),]
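Row and column selection can also be combined in one step. The sketch below compares bracket indexing with subset(), a base R function not shown in the slides; it uses the original mtcars column names, and heavy_a and heavy_b are hypothetical names.

```r
# Bracket indexing: rows where wt > 4, columns mpg and wt
heavy_a <- mtcars[which(mtcars$wt > 4), c("mpg", "wt")]

# subset(): the same selection, written with bare column names
heavy_b <- subset(mtcars, wt > 4, select = c(mpg, wt))

print(heavy_a)
print(heavy_b)
```

Both forms return the same rows; subset() is mostly a convenience for interactive work.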
Handling Missing Values
• Specify missing values before analysis
• y <- c(1, 2, 3, NA) # NA must be in capitals
• sum(y) # returns NA, because there is a missing value in the vector
• sum(y, na.rm=TRUE) # na.rm=TRUE removes NA values before summing
• help(sum)
• help(sum)
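A few more ways of working with missing values, as a sketch. is.na() and na.omit() are base R functions not covered above, and the small data frame df is made up for illustration.

```r
y <- c(1, 2, 3, NA)

is.na(y)                        # FALSE FALSE FALSE TRUE
n_missing <- sum(is.na(y))      # counts the missing values -> 1
total <- sum(y, na.rm = TRUE)   # 6
mean(y, na.rm = TRUE)           # 2

# na.omit() drops every row of a data frame that contains an NA
df <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))
complete_rows <- na.omit(df)
print(complete_rows)            # only the first row has no missing value
```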
