UNIT-I_r_programming-1
Features of R programming
1. It is a simple and effective programming language which has been well
developed.
2. It is a well-designed, easy, and effective language which supports
user-defined functions, loops, conditionals, and various I/O facilities.
3. For different types of calculation on arrays, lists and vectors, R contains a
suite of operators.
4. It provides effective data handling and storage facility.
5. It is an open-source, powerful, and highly extensible software.
6. It provides highly extensible graphical techniques.
7. R is an interpreted language.
Pros
1) Open Source
2) Platform Independent
R simplifies quality plotting and graphing. R libraries such as ggplot2 and plotly
produce visually appealing graphs, which sets R apart from
other programming languages.
5) Statistics
Cons
1) Data Handling
In R, objects are stored in physical memory, so the entire data set must fit in
memory at once. This makes R a poor choice when we deal with
Big Data.
2) Basic Security
3) Lesser Speed
4) Complicated Language
R is a complicated language with a steep learning curve. People
without prior knowledge or programming experience may find it difficult
to learn R.
The operating system allocates memory based on the data type of the variable and
decides what can be stored in the reserved memory.
There are the following data types which are used in R programming:
Logical: TRUE, FALSE. A special data type for data with only two possible
values, which can be construed as true/false.
Integer: 3L, 66L, 2346L. Here, L tells R to store the value as an integer.
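As a rough illustration, the type of a value can be checked with class(). The variable names below are illustrative only, and the numeric and character values are added for comparison.

```r
# Illustrative values; the names are not taken from the notes.
flag  <- TRUE      # logical
count <- 3L        # integer (note the L suffix)
ratio <- 3.5       # numeric (double)
name  <- "hello"   # character

types <- c(class(flag), class(count), class(ratio), class(name))
print(types)
```

Without the L suffix, a literal such as 3 is stored as numeric rather than integer.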
Variables in R Programming
Variables are used to store the information to be manipulated and referenced in
the R program.
Using assignment operators: R uses an arrow (<-) or an equal sign (=) to assign values to
variables.
R Variables Syntax
variable_name = value
variable_name <- value
Example:
var1 <- "hello"
print(var1)
Output: "hello"
2) Apart from the dot and underscore characters, no other special character is
allowed. Example: var$1 and var#1 are both invalid.
3) Variables can start with a letter or a dot. Example: .var and var
are valid.
4) A variable name should not start with a number or an underscore. Example: 2var
and _var are invalid.
5) If a variable name starts with a dot, the character after the dot cannot be a number.
Example: .3var is invalid.
6) The variable name should not be a reserved keyword in R. Example:
TRUE, FALSE, etc.
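The naming rules above can be checked in a small sketch. It relies on the base function make.names(), which rewrites a string into a syntactically valid name; a name that comes back unchanged is already valid. The helper is_valid_name is hypothetical and only for illustration.

```r
# Hypothetical helper: a name is valid when make.names() leaves it unchanged.
is_valid_name <- function(x) make.names(x) == x

candidates <- c("var", ".var", "var_1", "2var", "_var", ".3var", "var$1", "TRUE")
valid <- is_valid_name(candidates)

# Show each candidate next to the result of the check.
print(data.frame(name = candidates, valid = valid))
```

Only var, .var and var_1 pass; the other names break one of the rules listed above.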
class() function
This built-in function is used to determine the data type of the variable provided
to it. The R variable to be checked is passed to it as an argument, and it returns
the data type.
Syntax:
class(variable)
Example:
var1 = "hello"
print(class(var1))
Output: “character”
Operators in R
An operator is a symbol which tells the interpreter to perform
specific logical or mathematical manipulations. In R programming, there are
different types of operators, and each operator performs a different task.
Arithmetic Operators
Arithmetic operators are the symbols which are used to represent arithmetic math
operations. There are various arithmetic operators which are supported by R.
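A minimal sketch of the arithmetic operators applied to two vectors. The vectors a and b are illustrative; each operation is carried out element by element.

```r
a <- c(7, 10, 4)
b <- c(2, 5, 3)

total     <- a + b    # addition
diff      <- a - b    # subtraction
product   <- a * b    # multiplication
quotient  <- a / b    # division
remainder <- a %% b   # remainder after division
int_div   <- a %/% b  # integer (floor) division
power     <- a ^ b    # exponentiation

print(total)
print(remainder)
print(int_div)
print(power)
```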
Relational Operators
A relational operator compares each element of the first vector with the
corresponding element of the second vector. The result of the comparison will be
a Boolean value. There are the following relational operators which are supported
by R:
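A short sketch of the relational operators, using two illustrative vectors a and b. Each comparison returns a logical vector of the same length.

```r
a <- c(1, 3, 5)
b <- c(2, 3, 4)

gt  <- a > b    # greater than
lt  <- a < b    # less than
eq  <- a == b   # equal to
ge  <- a >= b   # greater than or equal to
le  <- a <= b   # less than or equal to
ne  <- a != b   # not equal to

print(gt)
print(eq)
print(ne)
```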
Logical Operators
The logical operators allow a program to make a decision on the basis of multiple
conditions. In the program, each operand is considered as a condition which can
be evaluated to a false or true value. The logical operator compares each element
of the first vector with the corresponding element of the second vector.
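4. && This operator takes the first element of both vectors and
gives TRUE as a result only if both are TRUE.
Example:
a <- c(3, 0, TRUE, 2+2i)
b <- c(2, 4, TRUE, 2+3i)
print(a[1] && b[1])
It will give us the following output:
[1] TRUE
Note: Since R 4.3.0, && raises an error when an operand has more than one
element, so the first elements are selected explicitly here.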
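A small sketch contrasting the element-wise logical operators (&, |, !) with the single-value operator &&. The vectors x and y are illustrative.

```r
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

and_result <- x & y          # element-wise AND
or_result  <- x | y          # element-wise OR
not_result <- !x             # element-wise NOT
first_and  <- x[1] && y[1]   # single-value AND on the first elements

print(and_result)
print(or_result)
print(not_result)
print(first_and)
```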
Assignment Operators
Miscellaneous Operators
Miscellaneous operators are used for a special and specific purpose. These
operators are not used for general mathematical or logical computation. There are
the following miscellaneous operators which are supported in R
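The list of operators is not reproduced above. As an assumption about what was intended, the sketch below shows two operators that are usually grouped under this heading: the colon operator (:) and the membership operator (%in%).

```r
# Colon operator: creates the sequence 2, 3, ..., 8.
v <- 2:8
print(v)

# %in% operator: tests whether each value on the left occurs in v.
in_v <- c(5, 12) %in% v
print(in_v)
```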
Note: There is only one difference between atomic vectors and lists. In an
atomic vector, all the elements are of the same type, but in the list, the elements
are of different data types.
print(Myvec)
Output: 1 3 1 4 2
We can create a vector with the help of the colon operator. The syntax for the
colon operator is:
z <- x:y
Example: b <- 1:10
print(b)
Output: 1 2 3 4 5 6 7 8 9 10
Example: seq_vec<-seq(1,4,by=0.5)
print(seq_vec)
Output: 1.0 1.5 2.0 2.5 3.0 3.5 4.0
Atomic vectors in R
Numeric vector
Integer vector
A non-fraction numeric value is known as integer data. An integer value can be
assigned to variable by appending L to the value.
Example: int_vec1<-c(1L,2L,3L,4L,5L)
class(int_vec1)
Output: “integer”
Character vector
A vector which contains character elements is known as a character vector. In R,
a character value can be created using double quotes ("") or single
quotes ('').
Example: char_vec1<-c("shubham","arpita","nishka","vaishali")
print(char_vec1)
class(char_vec1)
Output: "shubham" "arpita" "nishka" "vaishali"
"character"
Logical vector
The logical data type has only two values, i.e., TRUE or FALSE. These values
indicate whether a condition is satisfied. A vector which contains Boolean values is
known as a logical vector.
Example: d<- 5
e<- 6
f<- 7
log_vec<-c(d<e, d<f, e<d,e<f,f<d,f<e)
print(log_vec)
class(log_vec)
Output: TRUE TRUE FALSE TRUE FALSE FALSE
"logical"
We can access the elements of a vector with the help of vector indexing. Indexing
denotes the position where the value in a vector is stored.
Example: seq_vec<-seq(1,4,length.out=6)
BGS FIRST GRADE COLLEGE MYSURU SAHANA K
STATISTICA COMPUTING AND R PROGRAMMING
print(seq_vec)
print(seq_vec[2])
Output: 1.0 1.6 2.2 2.8 3.4 4.0
1.6
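Beyond picking a single position, indexing can also select several positions, drop positions, or filter by a condition. A minimal sketch, reusing the seq_vec vector from the example above (the other names are illustrative):

```r
seq_vec <- seq(1, 4, length.out = 6)

second        <- seq_vec[2]                # one element; positions start at 1
first_last    <- seq_vec[c(1, 6)]          # several positions at once
without_first <- seq_vec[-1]               # a negative index drops that position
above         <- seq_vec[seq_vec > 2.5]    # a logical condition keeps matching elements

print(second)
print(first_last)
print(without_first)
print(above)
```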
Vector Operation
1) Combining vectors
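Combining two or more vectors forms a new vector which contains all the
elements of each vector. If the vectors have different types, the elements are
coerced to a common type; in the example below the numbers become character strings.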
Example: p<-c(1,2,4,5,7,8)
q<-c("shubham","arpita","nishka","gunjan","vaishali","sumit")
r<-c(p,q)
print (r)
Output: "1" "2" "4" "5" "7" "8"
"shubham" "arpita" "nishka" "gunjan" "vaishali" "sumit"
2) Arithmetic operations
We can perform all the arithmetic operation on vectors. The arithmetic operations
are performed member-by-member on vectors.
Example:
a<-c(1,3,5,7)
b<-c(2,4,6,8)
print (a+b)
print (a-b)
Output: 3 7 11 15
-1 -1 -1 -1
R Lists
In R, lists are the second type of vector. A list is a data structure which has
components of mixed data types. Lists are the objects of R which contain
elements of different types such as number, vectors, string and another list inside
it.
Lists creation
Example:
1) list_1<-list(1,2,3)
list_2<-list("Shubham","Arpita","Vaishali")
print(list_1)
print(list_2)
Output:
1
2
3
"Shubham"
"Arpita"
"Vaishali"
2) list_data<-list("Shubham","Arpita",c(1,2,3,4,5),TRUE,FALSE,22.5,12L)
print(list_data)
Output:
"Shubham"
"Arpita"
1 2 3 4 5
TRUE
FALSE
22.5
12
There are only three steps to print the list data corresponding to the name:
1. Creating a list.
2. Assign a name to the list elements with the help of names() function.
3. Print the list data.
Example
Output:
$Students
[1] "Shubham" "Nishka" "Gunjan"
$Marks
[,1] [,2] [,3]
[1,] 40 60 90
[2,] 80 70 80
$Course
$Course[[1]]
[1] "BCA"
$Course[[2]]
[1] "MCA"
$Course[[3]]
[1] "B. tech."
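A minimal sketch of the three steps. The values are assumed, chosen so that the printed list has the same shape as the output above.

```r
# Step 1: create a list (values assumed to match the output above).
list_data <- list(
  c("Shubham", "Nishka", "Gunjan"),
  matrix(c(40, 80, 60, 70, 90, 80), nrow = 2),
  list("BCA", "MCA", "B. tech.")
)

# Step 2: assign names to the list elements.
names(list_data) <- c("Students", "Marks", "Course")

# Step 3: print the list.
print(list_data)
```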
R provides two ways through which we can access the elements of a list.
1) First one is the indexing method performed in the same way as a vector.
2) In the second one, we can access the elements of a list with the help of
names.
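Both ways of accessing list elements can be sketched as follows. The list and its Marks values are illustrative assumptions.

```r
# Illustrative list; the Marks values are assumed.
list_data <- list(Student = c("Shubham", "Arpita", "Nishka"),
                  Marks   = c(85, 72, 90))

by_index <- list_data[[1]]       # [[ ]] returns the element itself
by_name  <- list_data$Student    # $ returns the element with that name
sub_list <- list_data[1]         # [ ] returns a list containing the element

print(by_index)
print(by_name)
print(sub_list)
```

Single brackets keep the result wrapped in a list, which is why the printed output of list_data[1] starts with the element name.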
Output:
"Shubham" "Arpita" "Nishka"
Output:
$Student
"Shubham" "Arpita" "Nishka"
Example
Output:
"Moradabad"
$<NA>
NULL
$Course
"Masters of computer applications"
There is a drawback with the list, i.e., we cannot perform all the arithmetic
operations on list elements.
This drawback can be overcome with the function unlist( ), this function converts
the list into vectors.
Example:
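# Creating lists.
list1 <- list(1:5)
print(list1)
list2 <-list(10:14)
print(list2)
# Converting the lists to vectors.
v1 <- unlist(list1)
v2 <- unlist(list2)
# Adding the vectors.
print(v1 + v2)
Output: 1 2 3 4 5
10 11 12 13 14
11 13 15 17 19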
Merging Lists
To merge the lists, we have to pass all the lists into list function as a parameter,
and it returns a list which contains all the elements which are present in the lists.
Example
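The example code is not reproduced here, so the following is only a sketch with illustrative lists. It shows two possible readings of "merging": c() joins the elements into one flat list, while list() keeps the two lists as separate nested components.

```r
even_list <- list(2, 4, 6)
odd_list  <- list(1, 3, 5)

merged <- c(even_list, odd_list)      # one flat list with six elements
nested <- list(even_list, odd_list)   # a list holding the two original lists

print(length(merged))
print(length(nested))
```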
R Matrix
In R, a two-dimensional rectangular data set is known as a matrix.
A matrix is created with the help of the vector input to the matrix function.
Syntax: matrix(data, nrow, ncol, byrow, dimnames)
data:- It is the input vector which supplies the data elements of the matrix.
byrow:- The byrow parameter is a logical value. If its value is TRUE, then the input
vector elements are arranged by row.
print(p)
Output: 5 6 7
8 9 10
11 12 13
14 15 16
There are three ways to access the elements from the matrix.
1. We can access the element which is present in the nth row and mth column.
2. We can access all the elements of the matrix which are present on the nth
row.
3. We can also access all the elements of the matrix which are present on the
mth column.
Example: For the above created R matrix, accessing the elements as follow
#Accessing element present on 3rd row and 2nd column
print(R[3,2])
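All three ways of accessing matrix elements can be sketched together. The matrix below is built without row and column names, which is a simplification for illustration.

```r
# 4 x 3 matrix filled by row with the values 5 to 16.
R <- matrix(5:16, nrow = 4, byrow = TRUE)

elem <- R[3, 2]   # element in the 3rd row and 2nd column
row3 <- R[3, ]    # all elements of the 3rd row
col2 <- R[, 2]    # all elements of the 2nd column

print(elem)
print(row3)
print(col2)
```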
In matrix modification, the first method is to assign a single element to the matrix
at a particular position. By assigning a new value to that position, the old value
will get replaced with the new one. The syntax is matrix[n, m] <- y.
Here, n and m are the row and column of the element, respectively. And, y is
the value which we assign to modify our matrix.
Example:
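# row_names and col_names are assumed to hold the row and column labels shown in the output
R <- matrix(c(5:16), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
R[3,2] <- 20
print(R)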
Output:
col1 col2 col3
row1 5 6 7
row2 8 9 10
row3 11 20 13
row4 14 15 16
#Adding column
R <- cbind(R,c(17,18,19,20))
print(R)
Output:
col1 col2 col3
row1 5 6 7 17
row2 8 9 10 18
row3 11 12 13 19
row4 14 15 16 20
Matrix operations
In R, we can perform the mathematical operations on a matrix such as addition,
subtraction, multiplication, etc.
Example:
R <- matrix(c(5:16), nrow = 4,ncol=3)
S <- matrix(c(1:12), nrow = 4,ncol=3)
#Addition
sum<-R+S
print(sum)
#Subtraction
sub<-R-S
print(sub)
#Multiplication
mul<-R*S
print(mul)
#Division
div<-R/S
print(div)
Output:
[,1] [,2] [,3]
[1,] 6 14 22
[2,] 8 16 24
[3,] 10 18 26
[4,] 12 20 28
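Note that the * operator used above multiplies matrices element by element. The matrix product from linear algebra uses the %*% operator instead. A small sketch with two illustrative 2 x 2 matrices:

```r
A <- matrix(1:4, nrow = 2)   # columns: (1, 2) and (3, 4)
B <- matrix(5:8, nrow = 2)   # columns: (5, 6) and (7, 8)

elementwise <- A * B     # multiplies matching positions
matprod     <- A %*% B   # rows of A times columns of B

print(elementwise)
print(matprod)
```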
Applications of matrix
Arrays
In R, arrays are the data objects which allow us to store data in more than two
dimensions. In R, an array is created with the help of the array() function.
The array() function takes a vector as input and arranges its values according
to the dimensions given in the dim parameter.
For example, if we create an array of dimension (2, 3, 4), then it will create
4 rectangular matrices of 2 rows and 3 columns.
Syntax:
array_name <- array(data, dim = c(row_size, column_size, matrices), dimnames = dim_names)
row_size: This parameter defines the number of row elements which an array can
store.
dim_names: This parameter is used to change the default names of rows and
columns.
Creation of an Array:
There are only two steps to create a matrix which are as follows
Example:
Output:
,,1
[,1] [,2] [,3]
[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15
,,2
[,1] [,2] [,3]
[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15
In R, we can give the names to the rows, columns, and matrices of the array. This
is done with the help of the dim name parameter of the array() function.
Example:
#Creating two vectors of different lengths
vec1 <-c(1,3,5)
vec2 <-c(10,11,12,13,14,15)
Output:
, , Matrix1
, , Matrix2
print(res[3,2,2]) #To print third row second column element of 2nd matrix
Output: 12
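A fuller sketch of naming and indexing an array, built from the two vectors above. The row and column names are assumed for illustration; only Matrix1 and Matrix2 appear in the output shown.

```r
vec1 <- c(1, 3, 5)
vec2 <- c(10, 11, 12, 13, 14, 15)

# Row and column names are assumed for illustration.
row_names <- c("Row1", "Row2", "Row3")
col_names <- c("Col1", "Col2", "Col3")
mat_names <- c("Matrix1", "Matrix2")

# The 9 input values are recycled to fill both 3 x 3 matrices.
res <- array(c(vec1, vec2), dim = c(3, 3, 2),
             dimnames = list(row_names, col_names, mat_names))
print(res)

by_position <- res[3, 2, 2]                     # row 3, column 2, matrix 2
by_name     <- res["Row3", "Col2", "Matrix2"]   # the same element by name
print(by_position)
```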
Manipulation of elements
Output:
,,1
,,1
[,1] [,2] [,3]
[1,] 8 16 46
[2,] 4 73 36
[3,] 7 48 73
Data Frame
A data frame is a two-dimensional array-like structure or a table in which each
column contains values of one variable, and each row contains one set of values from
each column.
A data frame is a special case of the list in which each component has equal
length.
A matrix can contain one type of data, but a data frame can contain different data
types such as numeric, character, factor, etc.
In R, data frames are created with the help of the data.frame() function. This
function accepts vectors of any type, such as numeric, character, or integer.
Output:
employee_id employee_name sal starting_date
1 1 Shubham 623.30 2012-01-01
2 2 Arpita 915.20 2013-09-23
3 3 Nishka 611.00 2014-11-15
4 4 Gunjan 729.00 2014-05-11
5 5 Sumit 843.25 2015-03-27
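A sketch of code that would produce a data frame like the one shown above. The values are copied from the output, and the name emp.data is taken from the later examples in these notes.

```r
emp.data <- data.frame(
  employee_id   = 1:5,
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal           = c(623.30, 915.20, 611.00, 729.00, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE   # keep names as character, not factor
)

print(emp.data)
str(emp.data)   # shows the type of each column
```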
1. We can extract the specific columns from a data frame using the column
name.
Output:
  emp.data.employee_id emp.data.sal
1                    1       623.30
2                    2       915.20
3                    3       611.00
4                    4       729.00
5                    5       843.25
Extracting the specific rows from a data frame
Example
Output:
Example:
# Extracting 2nd and 3rd row corresponding to the 1st and 4th column
final <- emp.data[c(2,3),c(1,4)]
print(final)
Output:
  employee_id starting_date
2           2    2013-09-23
3           3    2014-11-15
It is possible to add and delete rows and columns to the data frame.
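A minimal sketch of adding and deleting rows and columns. The data frame df and its columns are hypothetical and only serve as an illustration.

```r
# Hypothetical data frame for illustration.
df <- data.frame(id = 1:3, name = c("A", "B", "C"), stringsAsFactors = FALSE)

# Add a column by assigning a vector of matching length.
df$score <- c(10, 20, 30)

# Add a row by binding a one-row data frame with the same columns.
new_row <- data.frame(id = 4L, name = "D", score = 40, stringsAsFactors = FALSE)
df <- rbind(df, new_row)

# Delete a column by assigning NULL to it.
df$name <- NULL

# Delete the first row with a negative row index.
df <- df[-1, ]

print(df)
```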
Output:
Non-Numeric Values
Logical-values
Introduction to Logical-values
• Logical-values can only take on two values: TRUE or FALSE.
• Logical-values represent binary states like
-> yes/no
-> one/zero
• Logical-values are used to indicate whether a condition has been met or not.
TRUE and FALSE Notation
• Logical-values are represented as TRUE and FALSE.
Assigning Logical-values
• Example:
b1 <- TRUE
b2 <- FALSE
Creating Vectors
• Vectors can be filled with logical-values using T or F.
• Example:
myvec <- c(T,T,F,F,F)
Vector Length
• You can determine the length of a vector using the `length` function.
• Example:
`length(myvec)` returns 5
# Creating a vector
vector2 <- c(1, 2, 3, 4, 5)
# Checking if all elements are greater than 0
result <- all(vector2 > 0)
Output:
TRUE
• Example:
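`b1 <- c(T, F, F)`
`b2 <- c(F, T, F)`
`b1[1] && b2[1]` returns `F`.
`b1[1] || b2[1]` returns `T`.
Note: `&&` and `||` compare single values only; since R 4.3.0 they raise an error
for vectors with more than one element. For element-wise comparison use `&` and `|`:
`b1 & b2` returns `F F F` and `b1 | b2` returns `T T F`.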
Example:
# Creating a numeric vector
numeric_vector <- c(1, 2, 3, 4, 5)
String
• A string is a data type.
• It is used to represent text or character data.
• Strings can consist of almost any combination of characters, including numbers.
• Strings are commonly used for storing and manipulating textual information.
For e.g.: names, sentences, and text-data extracted from files or databases.
Creating a String
• You can create strings using single or double quotation marks.
• Example:
single_quoted <- 'This is a single-quoted string.'
double_quoted <- "This is a double-quoted string."
Example:
# Define a string
my_string <- "Hello, World!"
"Hello World"
Escape Sequences
Backslash (\) Usage
• The backslash (\) is used to invoke an escape sequence.
• Escape sequences allow you to enter characters that control the format and
spacing of the string.
Common Escape Sequences
`\n` starts a newline.
`\t` represents a horizontal tab.
`\b` invokes a backspace.
`\\` is used to include a single backslash.
`\"` includes a double quote.
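A short sketch of the escape sequences in use. The string content is illustrative. cat() is used because it interprets the escapes when writing the text, whereas print() shows them in their escaped form.

```r
# Illustrative string using \t, \n, \\ and \".
msg <- "Name:\tR\nPath: C:\\temp\nQuote: \"hi\""

print(msg)       # shows the escape sequences as typed
cat(msg, "\n")   # writes the tab, the newlines, the backslash and the quotes

# An escape sequence counts as one character.
n <- nchar("a\tb")
print(n)
```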
sub() can be used for replacing the first occurrence of a pattern within a string
Example:
# Replacing the first occurrence of "apple" with "banana"
text <- "I like apples, but apples are red."
new_text <- sub("apple", "banana", text)
Output:
"I like bananas, but apples are red."
gsub() can be used for replacing all occurrences of a pattern within a string.
Example:
# Replacing all occurrences of "apple" with "banana"
text <- "I like apples, but apples are red."
new_text <- gsub("apple", "banana", text)
Output:
"I like bananas, but bananas are red."
Special Values
When a data set has missing observations, or when a practically infinite number
is calculated, R has some unique terms reserved for these situations.
They are:
1) NA (Not Available): If a value is not defined or an index is out of range,
NA is printed as output.
Example:
X <- c(1,2,3)
X[4]
Output: NA
2) Inf and -Inf: When a number is too large for R to represent, the value is
given as infinite.
Example:
1 / 0
Output: Inf
Output: NULL
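The special values can be detected with dedicated test functions. A minimal sketch with illustrative variable names:

```r
x <- c(1, 2, 3)

missing_val <- x[4]     # index out of range gives NA
pos_inf     <- 1 / 0    # Inf
neg_inf     <- -1 / 0   # -Inf
empty       <- NULL     # NULL represents an empty object

print(is.na(missing_val))     # TRUE
print(is.infinite(pos_inf))   # TRUE
print(neg_inf < 0)            # TRUE
print(is.null(empty))         # TRUE
print(length(empty))          # 0
```

Comparing with == does not work for NA (NA == NA is itself NA), which is why is.na() is used.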
Coercion
In R programming, converting from one object or data type to another object or
data type is referred to as coercion.
There are two types of coercion. They are:
Implicit coercion: This type of coercion occurs automatically.
Example: The logical value TRUE will be treated as 1 and FALSE will be treated as
0.
Explicit coercion: This type of coercion can be done with the help of as.integer,
as.logical, etc., functions.
Example:
x <- c(0,1,0,3)
class(x)
Output: "numeric"
x <- as.integer(x)
class(x)
Output: "integer"
x <- as.complex(x)
class(x)
Output: "complex"
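Implicit coercion can be seen in a short sketch. The variable names are illustrative.

```r
# Logical values are treated as 1 and 0 in arithmetic.
total <- sum(c(TRUE, FALSE, TRUE))
print(total)

# Mixing types in c() coerces every element to the most general type.
mixed <- c(1, "a", TRUE)
print(mixed)
print(class(mixed))

# Explicit coercion of numbers to logical: 0 becomes FALSE, non-zero becomes TRUE.
flags <- as.logical(c(0, 1, 0, 3))
print(flags)
```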
Basic plotting in R:
Example
Draw two points in the diagram, one at position (1, 3) and one at position (8, 10):
plot(c(1, 8), c(3, 10))
2) x<-c(1, 2, 3, 4, 5)
y<-c(3, 7, 8, 9, 12)
plot(x, y)
Draw a Line
The plot() function also takes a type parameter with the value l to draw a line to
connect all the points in the diagram:
plot(1:10, type="l")
Plot Labels
The plot() function also accepts other parameters, such as main, xlab and ylab, if
you want to customize the graph with a main title and different labels for the x
and y-axis:
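A minimal sketch of a labelled plot. The data and the label texts are illustrative.

```r
x <- 1:10
y <- x^2   # illustrative data

# main sets the title; xlab and ylab set the axis labels.
plot(x, y, main = "My Graph", xlab = "The x-axis", ylab = "The y-axis")
```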
Colors
Example:
plot(1:10, col="red")
Size
Point Shape
Use pch with a value from 0 to 25 to change the point shape format:
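Colour, size and point shape can be combined in one call. In this sketch the size is set with the cex parameter, which is the standard plot() argument for scaling points; its use here is an assumption, since the Size section above does not name it.

```r
x <- 1:10

# col sets the colour, cex scales the point size (1 is the default),
# and pch selects the point shape (19 is a filled circle).
plot(x, col = "red", cex = 2, pch = 19)
```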