Statistics With R Programming For Bigdata (Autosaved)
STATISTICS with R PROGRAMMING for DATA SCIENCE
Dr.A.MANIMARAN B.E,M.E,Ph.D
PROFESSOR,
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
SAVEETHA SCHOOL OF ENGINEERING, SIMATS, CHENNAI
In this Lecture
R and R Studio
How to:
Set the working directory
Create an R file and save it
Execute an R file
Variable
Basic Data Types
Advanced Data Structures
Functions
Classes
R
R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand
Open Source Programming Language
R is one of the world's most widely used programming languages for statistics and graphics
Statistical Software and Data Analysis Tool
Command Line Interface
Platforms: Windows, Linux, macOS
What is RStudio?
Output
The values of the variables can be printed using print() or cat() function.
The cat() function combines multiple items into a continuous print output
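The difference between the two output functions can be sketched as follows. The variable names and values here are assumptions chosen only for illustration.

```r
# Hypothetical values used only for illustration
name <- "R"
version <- 4

# print() displays one object at a time, prefixed with an index such as [1]
print(name)

# cat() joins several items on one line, separated by spaces by default
cat("Language:", name, "version", version, "\n")

# paste() builds the same text as a value, so it can be stored and reused
msg <- paste("Language:", name, "version", version)
print(msg)
```

Unlike print(), cat() does not add a trailing newline, which is why "\n" is passed explicitly.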
Variable
A variable is a named memory location allocated for the storage of specific data.
R Variables Syntax
• Using the equal-to operator
variable_name = value
• Using the leftward operator
variable_name <- value
• Using the rightward operator
value -> variable_name
EXAMPLE
# using equal-to operator
var1 = "hello"
print(var1)
# using leftward operator
var2 <- "hello"
print(var2)
# using rightward operator
"hello" -> var3
print(var3)
Keep in Mind
Keywords (reserved words) cannot be used as variable names
R DATA TYPES
Basic Data Type | Values | Example
Numeric | Set of all real numbers | numeric_value <- 3.14
Syntax:
is.data_type()
# Logical
print(is.logical(TRUE))
# Integer
print(is.integer(3L))
# Numeric
print(is.numeric(10.5))
# Complex
print(is.complex(1+2i))
# Character
print(is.character("12-04-2020"))
# A mismatched type returns FALSE
print(is.integer("a"))
print(is.numeric(2+3i))
Convert The Data Type Of An Object To Another
Syntax
as.data_type(object)
# Complex to character
print(as.character(1+2i))
# Not possible: returns NA with a warning
print(as.numeric("12-04-2020"))
# Numeric to logical (any non-zero number becomes TRUE)
print(as.logical(10.5))
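A failed conversion does not stop the script; it yields NA, so the result is worth checking. The sketch below assumes two hypothetical input strings and assumes that "12-04-2020" is in day-month-year order.

```r
# Hypothetical inputs chosen for illustration
x <- "42"
y <- "12-04-2020"

# A string holding a number converts cleanly
n <- as.numeric(x)
print(n)

# A string that is not a number converts to NA (the warning is silenced here)
bad <- suppressWarnings(as.numeric(y))
print(is.na(bad))

# A date-like string needs as.Date() with an explicit format instead
# (assumption: the string is day-month-year)
d <- as.Date(y, format = "%d-%m-%Y")
print(class(d))
```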
Advanced Data Structures (Data Types)
A data structure is a particular way of organizing data in a computer so
that it can be used effectively.
• Vectors
• Lists
• Dataframes
• Matrices
• Arrays
• Factors
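As a quick orientation, each of the structures listed above can be created with a single base R call. This is a minimal sketch; all names and values are illustrative assumptions.

```r
v  <- c(10, 20, 30)                            # vector: elements of one type
l  <- list(id = 1, tags = c("a", "b"))         # list: components of mixed types
df <- data.frame(x = 1:3, y = c("a", "b", "c")) # data frame: table of columns
m  <- matrix(1:6, nrow = 2)                    # matrix: two dimensions, one type
a  <- array(1:24, dim = c(2, 3, 4))            # array: any number of dimensions
f  <- factor(c("low", "high", "low"))          # factor: categorical values

# str() gives a compact summary of any of these objects
str(l)
```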
Vectors
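Accessing list components by index
To access a top-level component of a list, use the double-bracket operator [[ ]] (single brackets [ ] return a sub-list). To access an inner element, combine [[ ]] with [ ].
# emplist is assumed to be an existing list whose first component is a vector
print(emplist[1])       # sub-list holding the first component
print(emplist[[1]][2])  # second element of the first component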
Manipulating Lists
Concatenation of List:
li=c(list1,list2)
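The access rules and the concatenation above can be tried on two small lists. The contents of list1 and list2 below are hypothetical, since the slides do not define them.

```r
# Hypothetical lists for illustration
list1 <- list(name = "Asha", marks = c(80, 90))
list2 <- list(city = "Chennai")

sub   <- list1[1]             # single brackets: a sub-list of length 1
comp  <- list1[[1]]           # double brackets: the component itself
inner <- list1[["marks"]][2]  # second element of the inner vector

# c() joins the two lists into one list with three components
li <- c(list1, list2)
print(names(li))
print(length(li))
```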
Matrices
A matrix is a rectangular arrangement of numbers in rows and
columns
function matrix()
matrix(data, nrow, ncol, byrow, dimnames)
# Elements are arranged sequentially by row.
M <- matrix(c(3:14), nrow = 4, byrow = TRUE)
print(M)
# Elements are arranged sequentially by column.
N <- matrix(c(3:14), nrow = 4, byrow = FALSE)
print(N)
# Define the column and row names.
# (named row_names/col_names to avoid masking the base functions rownames() and colnames())
row_names <- c("row1", "row2", "row3", "row4")
col_names <- c("col1", "col2", "col3")
P <- matrix(c(3:14), nrow = 4, byrow = TRUE,
            dimnames = list(row_names, col_names))
print(P)
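The effect of byrow is easiest to see by reading individual elements back. A minimal sketch using the same data as above; the indexing shown is standard base R, not something stated on the slides.

```r
M <- matrix(3:14, nrow = 4, byrow = TRUE)   # filled row by row
N <- matrix(3:14, nrow = 4, byrow = FALSE)  # filled column by column

print(M[2, 3])  # row 2, column 3 of M
print(N[2, 3])  # same position in N holds a different value
print(M[1, ])   # whole first row
print(M[, 2])   # whole second column
print(dim(t(M)))  # transposing swaps the dimensions
```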
DATA FRAME
CREATE
Access rows and columns
Edit
Add new rows and columns
Dataframes
Data frames are generic data objects in R that are used to store
tabular data.
Function data.frame()
Age = c(22, 25, 45)
Language = c("R", "Python", "Java")
Name = c("Amiya", "Raj", "Asish")
df = data.frame(Name, Language, Age)
print(df)
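The remaining operations from the outline (accessing, editing, and adding rows and columns) can be sketched on the same data frame. The new values, the Senior column, and the extra row are hypothetical additions for illustration.

```r
df <- data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45)
)

# Access: a column by name, and a row by position
print(df$Age)
print(df[2, ])

# Edit: change a single cell
df[2, "Age"] <- 26

# Add a new column (hypothetical rule: older than 30)
df$Senior <- df$Age > 30

# Add a new row; it must supply every column
new_row <- data.frame(Name = "Kiran", Language = "C", Age = 30, Senior = FALSE)
df <- rbind(df, new_row)
print(df)
```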
Arrays
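An array generalizes a matrix to more than two dimensions. A minimal sketch using base R's array() function; the dimensions and values are illustrative assumptions.

```r
# 2 rows x 3 columns x 2 layers, filled column by column within each layer
arr <- array(1:12, dim = c(2, 3, 2))

print(dim(arr))
print(arr[1, 2, 1])  # row 1, column 2, layer 1
print(arr[, , 2])    # the whole second layer, returned as a matrix
```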
FUNCTION
A function is a block of code which runs only when it is called.
It has some inputs called arguments, and an output called the return value.
Creating a Function in R
A function is created by using the function() keyword and assigning the result to a name.
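A minimal sketch of defining and calling a function. The function names square and power are hypothetical examples, not part of the lecture.

```r
# Hypothetical function: takes one argument and returns its square
square <- function(x) {
  result <- x^2
  return(result)
}
print(square(4))

# Arguments can have default values; the last evaluated expression is returned
power <- function(base, exponent = 2) {
  base^exponent
}
print(power(3))     # uses the default exponent
print(power(2, 3))  # overrides the default
```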
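S3 CLASS
# An S3 object is a list tagged with a class attribute
a <- list(name = "manimaran", Roll_No = 101)
class(a) <- "Student"
# print() dispatches to this method for objects of class "Student"
print.Student <- function(obj)
{
  cat("name: ", obj$name, "\n")
  cat("Roll No: ", obj$Roll_No, "\n")
}
print(a)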
S4 CLASS
An S4 class has a formal, predefined definition. The S4 system provides functions for defining classes, generics, and methods.
setClass()
Syntax:
setClass("myclass", slots = list(name = "character",
Roll_No = "numeric"))
The new() function is used to create an object of an S4 class;
pass the class name as well as the values for the slots.
setClass("Student",slots=list(name="character",
Roll_No="numeric"))
a <- new("Student", name="Adam", Roll_No=20)
a
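Slots of an S4 object are read with the @ operator, and behaviour is attached through a generic and a method. In the sketch below, the generic name describe is a hypothetical choice; setGeneric() and setMethod() come from the standard methods package.

```r
library(methods)

setClass("Student", slots = list(name = "character", Roll_No = "numeric"))
a <- new("Student", name = "Adam", Roll_No = 20)

# Read a slot with @
print(a@name)

# Hypothetical generic "describe" with a method for the Student class
setGeneric("describe", function(obj) standardGeneric("describe"))
setMethod("describe", "Student", function(obj) {
  paste("Student", obj@name, "has roll number", obj@Roll_No)
})
print(describe(a))
```

Because the slot types are declared, new() rejects a value of the wrong type, for example a number passed as name.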
R Programming Structure
For loop
Syntax:
for (value in vector)
{
  statements
}
Example
for (i in 1:4)
{
  print(i ^ 2)
}
Repeat loop
A repeat loop iterates over a block of code multiple times.
It executes the same code again and again until a break statement is reached.
Syntax:
repeat
{
  commands
  if (condition)
  {
    break
  }
}
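A minimal sketch of a repeat loop that collects the squares of 1 to 4 and then stops. The variable names and the stopping condition are assumptions for illustration.

```r
i <- 1
squares <- c()

repeat {
  squares <- c(squares, i^2)  # append the square of the current counter
  i <- i + 1
  if (i > 4) {
    break                     # without this, the loop would never end
  }
}
print(squares)
```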
While loop
Syntax:
while (test_expression)
{
  statement
  update_expression
}
# R program to illustrate while loop
result <- "Hello World"
i <- 1
while (i < 6) {
  print(result)
  i <- i + 1
}
Next Statement
Syntax: {