Module 2-1

Uploaded by

Kunal Krishna
Module 2: R Programming Basics

Topics covered in Module-2

Introduction to R and RStudio


Looping in R
R data types
Processing time computation

Facilitator: Dr Sathiya Narayanan S VIT-Chennai - SENSE Fall Semester 2021-22 1 / 128


Introduction to R and RStudio

R is a programming language and free software environment for
statistical computing, available under the GNU General Public License.
Its predecessor, S, was started at Bell Laboratories in 1976 as an
interactive interface to Fortran libraries. R itself was created by
Ross Ihaka and Robert Gentleman at the University of Auckland, New
Zealand, in the early 1990s and was released as free software in 1995.
It is developed by the R Core Team and distributed, together with
contributed packages, through the Comprehensive R Archive Network (CRAN).
It is widely used among statisticians and data miners for
developing statistical software and for data analysis.

Introduction to R and RStudio (contd.)

R and its libraries implement a wide variety of techniques, including
linear and nonlinear modeling, classical statistical tests, time-series
analysis, regression, classification and clustering.
It has a very active community with many user-contributed packages,
and only a little programming knowledge is required to get started.
RStudio is an Integrated Development Environment (IDE) for R
with an advanced and more user-friendly GUI.
R is the substrate on which various features can be mounted, either
through packages such as Rcmdr (R Commander) or through a front end
such as RStudio.

Looping in R

for loop:
The for loop consists of the following parts: (i) the keyword for,
followed by parentheses; (ii) an identifier between the parentheses
(say i); (iii) the keyword in, which follows the identifier; (iv) a
vector of values to loop over; and (v) a code block between braces
that is carried out once for every value in that vector.
if-else:
An if-else statement contains the same elements as an if statement
(the keyword if, a condition in parentheses and a code block in
braces), plus two more: (i) the keyword else, placed after the first
code block; and (ii) a second block of code, contained within braces,
that is carried out if and only if the result of the
condition in the if () statement is FALSE.
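
The parts listed above can be put together in a short sketch. The vector `values` and the result vector `labels` are illustrative names that do not come from the slides.

```r
# Classify each value in a vector as even or odd.
# `values` and `labels` are illustrative names, not from the slides.
values <- c(3, 8, 5, 12)
labels <- character(length(values))

for (i in seq_along(values)) {      # identifier i, keyword in, vector to loop over
  if (values[i] %% 2 == 0) {        # condition of the if statement
    labels[i] <- "even"             # first code block
  } else {
    labels[i] <- "odd"              # second block, run only when the condition is FALSE
  }
}

print(labels)
```

Note that else sits on the same line as the closing brace of the first block, which is the placement R requires when the statement is typed at the top level.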

R data types
R has a wide variety of data types including (but not limited to)
the following. Refer to Figure 10.

Basic data types: numeric (double precision), integer, character,
logical and complex.
Vectors: Ordered collections of values of a single type. Defined
with the c() function.
Matrices: Two-dimensional arrays of a single type. Defined with the
matrix(c(), ...) function.
Lists: Collections of components that may differ in type and length
(e.g. numeric, character, etc.). Defined with the list() function.
Data frames: Lists of vectors that are usually of different types but
all of the same length. Defined with the data.frame() function.

Matrices and data frames can be extended by columns or by rows using
the cbind and rbind functions.
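
A minimal sketch of the types listed above follows. All variable names and values are made up for illustration.

```r
# Basic (atomic) types
num <- 3.14          # numeric (double precision)
int <- 7L            # integer (the L suffix requests an integer)
chr <- "hello"       # character
lgl <- TRUE          # logical
cpx <- 2 + 3i        # complex

# Containers
v   <- c(1, 2, 3)                                    # vector
m   <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)         # 2 x 3 matrix, filled by column
lst <- list(id = 1L, name = "a", scores = v)         # components differ in type and length
df  <- data.frame(id = 1:3, name = c("a", "b", "c")) # columns share one length

# cbind adds columns, rbind adds rows
m2  <- cbind(m, c(7, 8))                             # now 2 x 4
df2 <- rbind(df, data.frame(id = 4L, name = "d"))    # now 4 rows

print(class(num))
print(class(int))
print(dim(m2))
print(nrow(df2))
```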


Figure 10: R Data Types. Source: https://medium.com/@tiwarigaurav2512/r-data-types-847fffb01d5b


Processing time computation

proc.time() determines how much real and CPU time (in seconds)
the currently running R process has already taken.
It returns five elements for backwards compatibility, but its
print method prints a named vector of length 3.
The first two entries are the total user and system CPU times of the
current R process and any child processes on which it has waited,
and the third entry is the ‘real’ elapsed time since the process was
started. The last two entries are the cumulative sums of the user and
system times of any child processes spawned by it on which it has waited.
The ‘user time’ is the CPU time charged for the execution of user
instructions of the calling process. The ‘system time’ is the CPU
time charged for execution by the system on behalf of the calling
process.
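
The difference between the five stored entries and the three printed ones can be checked directly. This is only a quick inspection sketch; `pt` is an illustrative name.

```r
pt <- proc.time()

print(length(pt))   # number of stored entries
print(names(pt))    # names of all stored entries
print(pt)           # the print method shows user, system and elapsed only
```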

• In the proc.time() function in R, "user time" and "system time" both refer to CPU
time used by the current R process.

• "User time" is the CPU time spent executing the instructions of the R process itself,
which includes interpreting the R code you write in your scripts.

• "System time" is the CPU time spent by the operating system on behalf of the R
process, for example for file I/O or network communication.

• The proc.time() function returns a named numeric vector with five elements: "user.self",
"sys.self", "elapsed", "user.child" and "sys.child". The first two are the user and
system CPU times of the R process, and "elapsed" is the wall-clock time since the
R process was started. "user.child" and "sys.child" are the user and system CPU
times of child processes spawned by the R process. When the result is printed, only
three values are shown, labelled "user", "system" and "elapsed".
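
A common way to use proc.time() for processing time computation is to take a reading before and after the code of interest and subtract. The sketch below assumes a simple loop as the workload; `start`, `total` and `timing` are illustrative names.

```r
start <- proc.time()

# Some work to time: sum of square roots in an explicit loop
total <- 0
for (i in 1:100000) {
  total <- total + sqrt(i)
}

timing <- proc.time() - start
print(timing["elapsed"])    # wall-clock seconds spent on the loop
print(timing["user.self"])  # user CPU seconds charged to this R process
```

Base R also provides system.time(expr), which takes these two readings around a single expression.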
Module-2 Summary

Introduction to R and RStudio


Looping in R: for and if-else
R data types: basic data types (numeric, character, etc.),
vectors, matrices, lists and data frames
Processing time computation using proc.time()

Revisitation: Control Statements in R

Control Structure in R

IF

if (condition) {
  # do something
} else {
  # do something else
}

Example:

x <- 1:15
if (sample(x, 1) <= 10) {
  print("x is less than or equal to 10")
} else {
  print("x is greater than 10")
}

If else statement:

x <- 5
if (x > 1) {
  print("x is greater than 1")
} else {
  print("x is less than or equal to 1")
}

Vectorization with ifelse

ifelse(x <= 10, "x is at most 10", "x is greater than 10")

Other valid ways of writing if/else

if (sample(x, 1) < 10) {
  y <- 5
} else {
  y <- 0
}

y <- if (sample(x, 1) < 10) {
  5
} else {
  0
}
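
To make the vectorization point concrete: if () tests a single condition, whereas ifelse() evaluates its test element by element and returns a vector of the same length. The input vector and labels below are illustrative.

```r
x <- c(4, 10, 15)

# One label per element of x
labels <- ifelse(x <= 10, "at most 10", "greater than 10")
print(labels)
```

Passing a vector of length greater than one as the condition of a plain if () is an error in recent versions of R, which is why ifelse() is used for element-wise choices.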

x = 10
if (x > 1 && x < 7) {
  print("x is between 1 and 7")
} else if (x > 8 && x < 15) {
  print("x is between 8 and 15")
}

[1] "x is between 8 and 15"
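
An else-if chain is usually closed with a final else so that values matching none of the conditions are still handled. The sketch below wraps the same two ranges in a hypothetical helper, `describe_range`, which is not part of the slides.

```r
# `describe_range` is a hypothetical helper used only for illustration.
describe_range <- function(x) {
  if (x > 1 && x < 7) {
    "x is between 1 and 7"
  } else if (x > 8 && x < 15) {
    "x is between 8 and 15"
  } else {
    "x is outside both ranges"   # reached when no earlier condition is TRUE
  }
}

print(describe_range(10))
print(describe_range(7.5))
```

The scalar operator && is used here because each condition compares a single value; & is the element-wise form intended for vectors.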

for
A for loop works on an iterable variable and assigns successive
values until the end of a sequence.

for (i in 1:10) {
  print(i)
}

x <- c("apples", "oranges", "bananas", "strawberries")
for (i in x) {
  print(i)   # i is the element itself, not an index
}
for

x = c(1,2,3,4,5)
for(i in 1:5){
print(x[i])
}

o/p

[1] 1
[1] 2
[1] 3
[1] 4
[1] 5

for

x <- c(10, 20, 30, 40)

for (i in 1:4) {
  print(x[i])
}

for (i in seq(x)) {
  print(x[i])
}
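
When looping over the indices of a vector, seq_along(x) is generally a safer choice than seq(x) or 1:length(x). The sketch below shows the two edge cases where they differ; `single` and `empty` are illustrative names.

```r
x <- c(10, 20, 30, 40)
for (i in seq_along(x)) {
  print(x[i])
}

# A vector holding one number: seq() treats the value as an upper limit
single <- c(5)
print(seq(single))         # five indices
print(seq_along(single))   # one index

# An empty vector: 1:length() counts down from 1 to 0
empty <- numeric(0)
print(1:length(empty))     # two indices, so a loop body would run twice
print(seq_along(empty))    # no indices, so a loop body would not run
```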

Nested loops

m <- matrix(1:10, 2)

for (i in seq(nrow(m))) {
  for (j in seq(ncol(m))) {
    print(m[i, j])
  }
}

While

i <- 1
while (i < 10) {
  print(i)
  i <- i + 1
}

Be sure there is a way to exit out of a while loop.
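
One way to guarantee an exit is to combine the real stopping condition with a cap on the number of iterations. The sketch below is one possible pattern, not the only one; `tol` and `max_iter` are illustrative names.

```r
# Halve a value until it drops below a tolerance, with a cap on iterations.
value    <- 100
tol      <- 1     # stop once value is below this
max_iter <- 50    # safety limit so the loop always terminates
iter     <- 0

while (value >= tol && iter < max_iter) {
  value <- value / 2
  iter  <- iter + 1
}

print(iter)
print(value)
```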

Example:

x = 2.987
while (x <= 4.987) {
  x = x + 0.987
  print(c(x, x - 2, x - 1))
}

o/p:

[1] 3.974 1.974 2.974
[1] 4.961 2.961 3.961
[1] 5.948 3.948 4.948

Repeat and break

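expectation <- 0    # target value (illustrative)
threshold <- 0.1    # acceptable distance from the target (illustrative)
repeat {
  # simulate a value; if it is within the threshold of the
  # expectation, exit the loop
  value <- rnorm(1)
  if (abs(value - expectation) <= threshold) {
    break
  }
}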

Repeat Loop:
The repeat loop is an infinite loop, so it is used together with a
break statement that ends it.

# The code below shows a repeat loop:

a = 1
repeat {
  print(a)
  a = a + 1
  if (a > 4) break
}

o/p:
[1] 1
[1] 2
[1] 3
[1] 4
Break Statement
A break statement is used in a loop to stop the iterations and transfer
control to the first statement after the loop.

# The code below shows a break statement:

x = 1:10
for (i in x) {
  if (i == 2) {
    break
  }
  print(i)
}
[1] 1
Next

for (i in 1:20) {
  if (i %% 2 == 1) {
    next
  } else {
    print(i)
  }
}
This loop will only print even numbers and skip over odd numbers.

Next
The next statement skips the current iteration of a loop without
terminating the loop.
x = 1:4   # assumed here; it matches the output shown below
for (i in x) {
  if (i == 2) {
    next
  }
  print(i)
}
o/p
[1] 1
[1] 3
[1] 4
Switch Statement
• A switch statement tests a value for equality against a list of
case values.

• In a switch statement, the value being switched on is checked against
each case. The statement is generally used to select one of several
alternatives based on a condition.

Syntax:

switch(test_expression, case1, case2, case3, ..., caseN)
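
Besides selecting a case by position with a number, switch() can match a character string against named cases, and an unnamed final argument then serves as the default. The sketch below assumes a hypothetical helper, `area`, that is not part of the slides.

```r
# `area` is a hypothetical helper used only for illustration.
area <- function(shape, size) {
  switch(shape,
    square = size^2,
    circle = pi * size^2,
    stop("unknown shape: ", shape)   # unnamed last argument acts as the default
  )
}

print(area("square", 3))

# With a numeric selector, a position outside the list of cases yields NULL
out_of_range <- switch(5, "First", "Second")
print(is.null(out_of_range))
```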

Switch Statement

i = 2
gk <- switch(
  i,
  "First",
  "Second",
  "Third",
  "Fourth"
)
print(gk)

## [1] "Second"

Scan Function
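Read data from the screen (keyboard) by setting the file name to "" or by omitting it altogether:

> x <- scan("", what = "int")
1: 43 #input 43 from the screen
2:
Read 1 item
> x
[1] "43"

Because what = "int" is itself a character string, the values are read as
character; use what = integer() to read integers.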

> x <- scan("", what = "int")
1: 43 #input 43 from the screen
2: 22
3: 67
4:
Read 3 items
> x
[1] "43" "22" "67"

Large data can be scanned in by copy and paste, for example from Excel.

> x <- scan()

Then use "ctrl+v" to paste the data and press Enter on an empty line to
finish. With no arguments, scan() expects numeric values; other types must
be requested through the what argument.
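
The effect of the what argument can be checked without typing at the prompt by supplying the input through scan()'s text argument. The input string and variable names below are illustrative.

```r
# Same input, three different settings of `what`
as_chr <- scan(text = "43 22 67", what = "int", quiet = TRUE)      # a string example: character
as_int <- scan(text = "43 22 67", what = integer(), quiet = TRUE)  # integer
as_dbl <- scan(text = "43 22 67", quiet = TRUE)                    # default: double

print(class(as_chr))
print(class(as_int))
print(class(as_dbl))
```

The type of the result follows the type of the value passed as what, not its contents, which is why "int" produces character data.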
