
Introduction to R

Analytics and R Workshop


R Module - 1

Copyright 2017 : Anish Roychowdhury Jacob Minz


Agenda
• What is R?
• Understanding the R Studio IDE
• Preliminary Data Assignment and Math Operators
• Vectors and Matrices
• Data frames and Lists
• Initialization concepts
• File I/O – Reading and writing CSV data from files
• Module 1 Quiz



What is R?
Not just another alphabet!

R is a programming language and software environment for statistical computing and graphics
supported by the R Foundation for Statistical Computing.

History

R is an implementation of the S programming language combined with lexical scoping semantics inspired
by Scheme.[11] S was created by John Chambers while at Bell Labs.

R was created by Ross Ihaka and Robert Gentleman[13] at the University of Auckland, New Zealand, and is
currently developed by the R Development Core Team, of which Chambers is a member. R is named partly after
the first names of the first two R authors and partly as a play on the name of S.[14]



Understanding the R Studio IDE
[Screenshot of the R Studio window with four labelled areas]
• Editor Window
• Variable Information
• Documentation results
• Command Line

Preliminary Data Assignment and Math Operators
# Comment line marker: text after '#' is a comment
# clear all data variables
rm(list=ls())

# Assignment operators: both <- and = assign a value
# basic operations
x <- 11; y <- 4;
# add
z = x+y
# subtract
z = x-y
# multiply
z = x*y
# to the power
z = x^y

# modulo division remainder
z = x%%y
> z
[1] 3

# integer divide
z = x%/%y
> z
[1] 2
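
The two division operators above are linked: the integer quotient and the remainder always rebuild the original number. A minimal sketch of that check follows; the negative value x_neg is an added example and is not part of the slides.

```r
x <- 11; y <- 4
quotient  <- x %/% y                    # integer divide: 2
remainder <- x %% y                     # modulo remainder: 3
# The two results rebuild the original number
rebuilt <- quotient * y + remainder     # 11

# Assumed extra case (not in the slides): a negative dividend.
# %/% rounds toward minus infinity, so the remainder takes the sign of y.
x_neg <- -11
quotient_neg  <- x_neg %/% y            # -3
remainder_neg <- x_neg %% y             # 1
```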



Preliminary Data Assignment and Math Operators contd.
# Log and exponentials
vec <- (1:10)
# square root
z = sqrt(4)
# Natural log
z = log(vec)
# exponential
y = exp(z)
# Base 10 log
z = log(vec, base = 10)
> z
[1] 0.0000000 0.3010300 0.4771213 0.6020600 0.6989700 0.7781513 0.8450980 0.9030900 0.9542425 1.0000000

# factorial
z = factorial(4)

# combinatorics nCr
n=5; r=3
num = choose(n,r)
> num
[1] 10
num2 = choose(n,n-r)
> num2
[1] 10
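
The relationships behind these functions can be checked directly. The sketch below compares choose() with the factorial formula and confirms that exp() undoes log(); the variable names by_formula, by_choose, round_trip and same_log10 are illustrative only.

```r
n <- 5; r <- 3
# choose(n, r) counts r-element subsets: n! / (r! * (n - r)!)
by_formula <- factorial(n) / (factorial(r) * factorial(n - r))    # 10
by_choose  <- choose(n, r)                                         # 10
# Symmetry: picking r items is the same as leaving out n - r items
symmetric <- choose(n, r) == choose(n, n - r)                      # TRUE

# exp() undoes log(); all.equal() compares within a small numeric tolerance
vec <- 1:10
round_trip <- isTRUE(all.equal(exp(log(vec)), vec))                # TRUE
# The base 10 log is also available as log10()
same_log10 <- isTRUE(all.equal(log(vec, base = 10), log10(vec)))   # TRUE
```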
Preliminary Data Assignment and Math Operators contd.
# Rounding Numbers
x = 123.456

# normal rounding 2 decimal places
z = round(x,digits = 2)
> z
[1] 123.46

# flooring
z = floor(x)
> z
[1] 123

# ceiling
z = ceiling(x)
> z
[1] 124

# truncating decimal part
z = trunc(x)
> z
[1] 123
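
For a positive number, floor() and trunc() give the same answer, so the difference only shows up with negative values. A small sketch, assuming a negative test value x_neg that is not in the slides:

```r
x_neg <- -123.456                          # assumed example: the negative of x above
floored   <- floor(x_neg)                  # -124: next integer down
ceilinged <- ceiling(x_neg)                # -123: next integer up
truncated <- trunc(x_neg)                  # -123: drops the decimal part, moving toward zero
rounded   <- round(x_neg, digits = 2)      # -123.46
```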



Vectors
A vector is a sequence of data elements of the same basic type. Members in a vector are officially called components.

# Define a Vector as arbitrary numbers
My_First_Vector <- c(12,4,4,6,9,3)

# Generating a vector using sequence of numbers with increment
My_Second_Vector = seq(from = 2.5, to = 5.0, by = 0.5)

# Note: both are of same length (6 components)

# linear operation on two vectors
My_Third_Vec = 10* My_First_Vector + 20*My_Second_Vector
> My_Third_Vec
[1] 170 100 110 140 180 130

# combining two vectors
First_and_Second <- c(My_First_Vector, My_Second_Vector)
> First_and_Second
[1] 12.0 4.0 4.0 6.0 9.0 3.0 2.5 3.0 3.5 4.0 4.5 5.0
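
The note about equal lengths matters because R does not stop when two vectors differ in length: it recycles the shorter one. A minimal sketch with two hypothetical vectors, short_vec and long_vec, that are not from the slides:

```r
long_vec  <- c(1, 2, 3, 4)      # hypothetical example vectors
short_vec <- c(10, 20)
# short_vec is reused as 10, 20, 10, 20 to match the length of long_vec
recycled_sum <- long_vec + short_vec                       # 11 22 13 24

# A defensive check before combining vectors element by element
same_length <- length(long_vec) == length(short_vec)       # FALSE
```

Recycling is silent when the longer length is a multiple of the shorter one, so an explicit length check like the one above can be worth adding.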
More on Vectors
# repeat a vector 3 times
vec3 <- c(0,0,7)
Rvec3 <- rep(vec3,times=3)
> Rvec3
[1] 0 0 7 0 0 7 0 0 7

# Generating a vector using 'n' numbers equally spaced
vec2 = seq(from = 2.5, to = 7, length.out = 10)
> vec2
[1] 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0

# Repeat individual occurrences of a vector specified number of times
Rvec321 <- rep(c(1,2,3),times = c(3,2,1))
> Rvec321
[1] 1 1 1 2 2 3

# Repeat each occurrence in a vector 'n' times
Rvecn <- rep(c(1,2,3),each=3)
> Rvecn
[1] 1 1 1 2 2 2 3 3 3
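
The length.out form of seq() is equivalent to choosing the step yourself, and the each and times arguments of rep() can be combined. The sketch below illustrates both; the names step, by_length, by_step and each_then_times are illustrative only.

```r
from <- 2.5; to <- 7; n_points <- 10
# n equally spaced points leave n - 1 gaps between them
step <- (to - from) / (n_points - 1)                       # 0.5
by_length <- seq(from = from, to = to, length.out = n_points)
by_step   <- seq(from = from, to = to, by = step)
same_vector <- isTRUE(all.equal(by_length, by_step))       # TRUE

# 'each' is applied first, then 'times' repeats the whole result
each_then_times <- rep(c(1, 2), each = 2, times = 2)       # 1 1 2 2 1 1 2 2
```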



Logical Vectors
Player_1 <- c(10,34,54,78,99)
Player_2 <- c(4,24,67,49,100)

# Find out how Player 1 performed vs Player 2
Player_1.success <- Player_1 > Player_2
> Player_1.success
[1] TRUE TRUE FALSE TRUE FALSE

# Which matches did Player 1 win?
Player_1_win <- which(Player_1.success)
> Player_1_win
[1] 1 2 4

# What did Player 1 score in the matches Player 1 won?
P1_win_scores <- Player_1[Player_1_win]
> P1_win_scores
[1] 10 34 78

# How many matches did Player 1 win?
> sum(Player_1.success)
[1] 3

# Did Player 1 win any match?
> any(Player_1.success)
[1] TRUE

# Did Player 1 win all the matches?
> all(Player_1.success)
[1] FALSE
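
A logical vector can also be used directly as an index, without going through which(). The following sketch reuses the player scores from above; the names wins, win_scores, loss_scores and win_rate are illustrative and not from the slides.

```r
Player_1 <- c(10, 34, 54, 78, 99)
Player_2 <- c(4, 24, 67, 49, 100)
wins <- Player_1 > Player_2

# Index with the logical vector itself: keeps the positions that are TRUE
win_scores <- Player_1[wins]         # 10 34 78
# Negate with ! to get the matches Player 1 did not win
loss_scores <- Player_1[!wins]       # 54 99
# TRUE counts as 1 and FALSE as 0, so the mean is the share of wins
win_rate <- mean(wins)               # 0.6
```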
Strings
# Define a string
x <- "Hello World"

# Get its length (a single string is a vector of length 1)
lenx = length(x)
> lenx
[1] 1

# How many characters in x?
ncharx = nchar(x)
> ncharx
[1] 11

# Define a vector of 2 strings
y <- c("Hello","World")

# get its length
leny = length(y)
> leny
[1] 2
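
A few other base R string functions follow the same pattern of working on whole vectors. This is a short sketch using the x and y defined above; the result names are illustrative only.

```r
x <- "Hello World"
y <- c("Hello", "World")

# nchar() is applied to each component of a character vector
chars_each <- nchar(y)                           # 5 5
# paste() with collapse joins the components into one string
joined <- paste(y, collapse = " ")               # "Hello World"
same_as_x <- identical(joined, x)                # TRUE
# substr() extracts characters by position
first_word <- substr(x, start = 1, stop = 5)     # "Hello"
# toupper() converts to upper case
shouted <- toupper(x)                            # "HELLO WORLD"
```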



Naming strings
# Create a vector month.days
month.days <- c(31,28,31,30,31,30,31,31,30,31,30,31)
> month.days
[1] 31 28 31 30 31 30 31 31 30 31 30 31

# Assign Month short names
mon.shortname <- c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec")
names(month.days) <- mon.shortname

# print name of the 5th month
> names(month.days[5])
[1] "May"

# print month names having days = 31
> names(month.days[month.days==31])
[1] "Jan" "Mar" "May" "Jul" "Aug" "Oct" "Dec"
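
Once a vector has names, its components can be looked up by name as well as by position. The sketch below uses R's built-in constant month.abb in place of the hand-typed mon.shortname; that substitution and the result names are assumptions for illustration.

```r
month.days <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
names(month.days) <- month.abb       # built-in constant: "Jan" ... "Dec"

# Look up a component by its name
feb_days <- month.days["Feb"]        # 28, still carrying the name "Feb"
# Names do not affect arithmetic on the values
days_in_year <- sum(month.days)      # 365
# Names of the months with 30 days
thirty_day_months <- names(month.days)[month.days == 30]   # "Apr" "Jun" "Sep" "Nov"
```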



Matrices
A matrix is a collection of data elements arranged in a two-dimensional rectangular layout.
The following is an example of a matrix with 2 rows and 3 columns.

A = | 1 2 3 |
    | 4 5 6 |

The code below builds a different 2 x 3 matrix, also named A, and extracts parts of it.
A leading + is the console's command continuation prompt.

> A = matrix(
+ c(2, 4, 3, 1, 5, 7), # the data elements
+ nrow=2, # number of rows
+ ncol=3, # number of columns
+ byrow = TRUE) # fill matrix by rows

> A # print the matrix
     [,1] [,2] [,3]
[1,]    2    4    3
[2,]    1    5    7

# Extract 2nd row 3rd column
A23 <- A[2, 3]
> A23
[1] 7

# Extract 2nd row as a vector
ARow2Vec <- A[2, ] # the 2nd row
> ARow2Vec
[1] 1 5 7

# Extracting a sub matrix
A2by2 <- A[1:2,1:2]
> A2by2
     [,1] [,2]
[1,]    2    4
[2,]    1    5
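
The byrow argument decides how the data elements are laid out, which is easy to get wrong. The sketch below builds the same data both ways and extracts a column; A_bycol and the other result names are illustrative and not from the slides.

```r
A <- matrix(c(2, 4, 3, 1, 5, 7), nrow = 2, ncol = 3, byrow = TRUE)

dims <- dim(A)               # 2 3 (rows, columns)
col2 <- A[, 2]               # 2nd column as a vector: 4 5
At   <- t(A)                 # transpose: 3 rows, 2 columns

# Without byrow = TRUE the same data is filled column by column
A_bycol <- matrix(c(2, 4, 3, 1, 5, 7), nrow = 2, ncol = 3)
by_row_value <- A[1, 2]          # 4
by_col_value <- A_bycol[1, 2]    # 3
```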
Data Frames
A data frame is used for storing data tables. It is a list of vectors of equal length. For
example, the following variable df is a data frame containing three vectors n, s, b.

n <- c(2, 3, 5)
s <- c("aa", "bb", "cc")
b <- c(TRUE, FALSE, TRUE)
df <- data.frame(n, s, b) # df is a data frame

How the data frame would look: each vector becomes a column in the data frame

> df
  n  s     b
1 2 aa  TRUE
2 3 bb FALSE
3 5 cc  TRUE
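
Columns of a data frame can be pulled out and filtered much like vectors. A minimal sketch using the same n, s and b follows; the stringsAsFactors = FALSE argument is an added assumption so that s stays a character column in any R version, and the result names are illustrative.

```r
n <- c(2, 3, 5)
s <- c("aa", "bb", "cc")
b <- c(TRUE, FALSE, TRUE)
# stringsAsFactors = FALSE keeps s as character in older and newer R versions
df <- data.frame(n, s, b, stringsAsFactors = FALSE)

col_n <- df$n                 # one column as a vector: 2 3 5
true_rows <- df[df$b, ]       # rows where b is TRUE (rows 1 and 3)
s_where_b <- df$s[df$b]       # "aa" "cc"
df$n_squared <- df$n^2        # add a new column: 4 9 25
```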



Data Frames contd.
Viewing the first 6 rows of a built in data frame “mtcars”

# extract a particular element with row and col names
> mtcars["Mazda RX4", "cyl"]
[1] 6

# Get number of Rows information
> nrow(mtcars)
[1] 32

# Get number of Columns information
> ncol(mtcars)
[1] 11
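
The same built-in data frame can be explored with a few more base R calls. This is a sketch only; head() is assumed to be the function behind the "first 6 rows" view mentioned above, and the result names are illustrative.

```r
# head() returns the first 6 rows by default
first_rows <- head(mtcars)
# dim() gives rows and columns in one call
size <- dim(mtcars)                        # 32 11
# Column names of the data frame
cols <- colnames(mtcars)
# Keep only the rows for 6-cylinder cars
six_cyl <- mtcars[mtcars$cyl == 6, ]
# Summary statistic on one column
mean_mpg <- mean(mtcars$mpg)
```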
Lists
A list is a generic vector containing other objects. In the example shown,
the following variable x is a list containing copies of three vectors n, s, b,
and a numeric value 3

> n = c(2, 3, 5)
> s = c("aa", "bb", "cc", "dd", "ee")
> b = c(TRUE, FALSE, TRUE, FALSE, FALSE)
> x = list(n, s, b, 3) # x contains copies of n, s, b

How the List looks

[[1]]
[1] 2 3 5

[[2]]
[1] "aa" "bb" "cc" "dd" "ee"

[[3]]
[1] TRUE FALSE TRUE FALSE FALSE

[[4]]
[1] 3



Lists contd.

Complete List

[[1]] [1] 2 3 5
[[2]] [1] "aa" "bb" "cc" "dd" "ee"
[[3]] [1] TRUE FALSE TRUE FALSE FALSE
[[4]] [1] 3

Extracting a sub list from a given list

child_list <- x[c(2, 4)]

[[1]] [1] "aa" "bb" "cc" "dd" "ee"
[[2]] [1] 3

Slicing the list to extract a member

Second_Elem_Slice <- x[2]

[[1]] [1] "aa" "bb" "cc" "dd" "ee"

Directly referencing a member of the list

Sec_Member <- x[[2]]

[1] "aa" "bb" "cc" "dd" "ee"

Directly referencing an item of a member of a list

Sec_Mem_First_Item <- x[[2]][1]

[1] "aa"
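
The difference between slicing with single brackets and referencing with double brackets shows up in the type of the result. A minimal sketch using the same list x; the named list x_named is an added variation that is not in the slides.

```r
n <- c(2, 3, 5)
s <- c("aa", "bb", "cc", "dd", "ee")
b <- c(TRUE, FALSE, TRUE, FALSE, FALSE)
x <- list(n, s, b, 3)

slice_class  <- class(x[2])       # "list": a slice is still a list
member_class <- class(x[[2]])     # "character": the member itself
slice_len    <- length(x[2])      # 1: a list holding one member
member_len   <- length(x[[2]])    # 5: the five strings

# Assumed variation (not in the slides): named members can be reached with $
x_named <- list(numbers = n, strings = s, flags = b, value = 3)
first_string <- x_named$strings[1]   # "aa"
```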



Initialization concepts
Assigning value to a variable

Var1 <- 5

Initialize a numeric vector of length 10


Vec_Size_10 <- vector(mode="numeric", length=10)

Initialize the vector with 5 repeats of '10' and then 5 repeats of '20'

Vec_Size_10 <- rep(c(10,20),each=5)

Create an empty dataframe

df_3col_5row <- as.data.frame(matrix(ncol=3, nrow=5))

# Initialize the first column to 1's, the 2nd col to 2's and the 3rd col to 3's
for (i in 1:5){
df_3col_5row[i,] <- c(1,2,3)
}
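
The row-by-row loop can also be written without a loop by building the values first. The sketch below is one possible alternative, not the slides' method; df_loop, df_vec and same_values are hypothetical names used for the comparison.

```r
# Loop version, as in the slide
df_loop <- as.data.frame(matrix(ncol = 3, nrow = 5))
for (i in 1:5) {
  df_loop[i, ] <- c(1, 2, 3)
}

# Vectorised version: rep(..., each = 5) gives 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3,
# and matrix() fills column by column, so each column gets one repeated value
df_vec <- as.data.frame(matrix(rep(c(1, 2, 3), each = 5), nrow = 5, ncol = 3))

# Both approaches hold the same values
same_values <- all(as.matrix(df_loop) == as.matrix(df_vec))   # TRUE
```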
List Initialization concepts
Create List column names

mylist.names <- c("COL_1", "COL_2", "COL_3")

Create empty list


mylist <- vector("list", length(mylist.names))

Initialize list with 3 Vectors of different length


mylist <- list(a=1, b=1:2, c=1:3)
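
The names created in the first step can be attached to the empty list, which then lets each member be filled by name. This is a sketch of one way to connect the steps above; the names() assignment and the result names all_null and member_lengths are assumptions, not taken from the slides.

```r
mylist.names <- c("COL_1", "COL_2", "COL_3")
mylist <- vector("list", length(mylist.names))
# Attach the names to the empty list
names(mylist) <- mylist.names

# Every member of the freshly created list is NULL
all_null <- all(sapply(mylist, is.null))      # TRUE

# Fill the members by name with vectors of different lengths
mylist[["COL_1"]] <- 1
mylist[["COL_2"]] <- 1:2
mylist[["COL_3"]] <- 1:3
member_lengths <- sapply(mylist, length)      # COL_1 = 1, COL_2 = 2, COL_3 = 3
```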



Module 1 - Quiz
• All elements of data frames are vectors: TRUE
• All elements of lists must have the same length: FALSE
• Data frames are the most flexible structure in R: FALSE



Thank You
