Statistics With R Programming For Bigdata (Autosaved)


Guest lecture

on
STATISTICS with R PROGRAMMING for DATA SCIENCE

Dr.A.MANIMARAN B.E,M.E,Ph.D
PROFESSOR,
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
SAVEETHA SCHOOL OF ENGINEERING, SIMATS, CHENNAI
In this Lecture

 R and RStudio
 How to
• Set the working directory
• Create an R file and save it
• Execute an R file
 Variables
 Basic Data Types
 Advanced Data Structures
 Functions
 Classes
R

 R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand
 Open-source programming language
 R is one of the world's most widely used programming languages for statistics and graphics
 Statistical software and data analysis tool
 Command-line interface
 Platforms:
• Windows
• Linux
• macOS
What is RStudio?

 Integrated Development Environment (IDE) for R

 Available as open-source and commercial software

 Editions: Desktop version and Server version

[Figure: A first look at RStudio]
Basic Program (syntax)
 Depart <- "Welcome AIDS"
 print(Depart)
 cat("depart", Depart)

 Output
 [1] "Welcome AIDS"
 depart Welcome AIDS

 The values of variables can be printed using the print() or cat() function.
 The cat() function combines multiple items into one continuous printed output.
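The difference between print() and cat() described above can be sketched as follows. The variable names `dept` and `msg` are hypothetical and used only for this illustration.

```r
# 'dept' is a hypothetical variable name used only for this sketch.
dept <- "Welcome AIDS"

# print() shows a single object, with the [1] index prefix and quotes.
print(dept)

# cat() joins its arguments with a space and prints them without quotes;
# the newline has to be added explicitly.
cat("dept:", dept, "\n")

# paste() builds the same kind of combined text as a value instead of printing it.
msg <- paste("dept:", dept)
print(msg)
```

As a rule of thumb, print() is convenient for inspecting an object, while cat() is convenient for building a readable message from several pieces.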
Variable

 A variable is a named memory location allocated for the storage of specific data
 R Variables Syntax
• Using the equal-to operator
variable_name = value
• Using the leftward operator
variable_name <- value
• Using the rightward operator
value -> variable_name

EXAMPLE
# using equal to operator
var1 = "hello"
print(var1)
# using leftward operator
var2 <- "hello"
print(var2)
# using rightward operator
"hello" -> var3
print(var3)
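Keep in Mind

 Allowed characters are alphanumeric characters, "_" and ".".

 Always start with a letter (or with a "." that is not followed by a digit).

 No special characters such as @, $, etc.

 No reserved words (keywords) such as if, else, for, TRUE or function.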
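The naming rules above can be checked with a small sketch. It relies on base R's make.names(), which rewrites a string into a syntactically valid name, so a candidate is valid exactly when make.names() leaves it unchanged. The candidate names are hypothetical examples.

```r
# Candidate variable names; all of them are hypothetical examples.
candidates <- c("my_var", "my.var", "var2", ".hidden", "2var", "my@var", "if")

# make.names() repairs invalid names (for example "2var" becomes "X2var"),
# so an unchanged result means the candidate was already valid.
is_valid <- make.names(candidates) == candidates

print(data.frame(name = candidates, valid = is_valid))
```

The last three candidates fail for three different reasons: a leading digit, a special character, and a reserved word.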
R DATA TYPES
Basic Data Type   Values                                          Example
Numeric           Set of all real numbers                         numeric_value <- 3.14
Integer           Set of all integers, Z                          integer_value <- 42L
Logical           TRUE and FALSE                                  logical_value <- TRUE
Complex           Set of complex numbers                          complex_value <- 1 + 2i
Character         "a", "b", "c", …, "@", "#", "$", …, "1", "2", … etc.   character_value <- "Hello Geeks"
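As a quick check of the table above, the following sketch asks R for the class and the internal storage type of each example value. The helper names `values`, `classes` and `types` are assumptions made for this illustration.

```r
numeric_value   <- 3.14
integer_value   <- 42L
logical_value   <- TRUE
complex_value   <- 1 + 2i
character_value <- "Hello Geeks"

# Collect the values in a list so that each keeps its own type.
values <- list(numeric_value, integer_value, logical_value,
               complex_value, character_value)

# class() gives the high-level type, typeof() the internal storage type.
classes <- sapply(values, class)
types   <- sapply(values, typeof)

print(classes)
print(types)
```

Note that a numeric value reports the class "numeric" but the storage type "double".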
TYPE VERIFICATION

 Syntax:
is.data_type(object)
# Logical
print(is.logical(TRUE))
# Integer
print(is.integer(3L))
# Numeric
print(is.numeric(10.5))
# Complex
print(is.complex(1+2i))
# Character
print(is.character("12-04-2020"))
# Mismatched types return FALSE
print(is.integer("a"))
print(is.numeric(2+3i))
Convert The Data Type Of An Object To Another

 Syntax
as.data_type(object)

# Complex to character
print(as.character(1+2i))

# Not possible: returns NA with a warning
print(as.numeric("12-04-2020"))

# Numeric to logical (any non-zero number becomes TRUE)
print(as.logical(10.5))
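The conversions above can be sketched with their results stored in variables. The variable names are hypothetical, and suppressWarnings() is used only to keep the failed conversion quiet in this illustration.

```r
# Complex to character: the number becomes its text representation.
as_text <- as.character(1 + 2i)

# A string that is not a number cannot be converted; the result is NA.
not_a_number <- suppressWarnings(as.numeric("12-04-2020"))

# A string that does look like a number converts cleanly.
a_number <- as.numeric("3.14")

# Numeric to logical: zero becomes FALSE, any other number becomes TRUE.
non_zero <- as.logical(10.5)
zero     <- as.logical(0)

print(as_text)
print(not_a_number)
print(a_number)
print(c(non_zero, zero))
```

Checking the result with is.na() after a conversion is a simple way to detect values that could not be converted.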
Advanced Data Structures (Data Types)
 A data structure is a particular way of organizing data in a computer so that it can be used effectively.
• Vectors
• Lists
• Dataframes
• Matrices
• Arrays
• Factors
Vectors

 A vector is a sequence of data elements; an atomic vector holds elements of a single (homogeneous) type.

 Atomic vector
• Integer
• Double
• Logical
• Character
• Complex
• Raw
 Recursive vector
• List
 The function c():
x <- c(1, -1, 3.5, 2)
print(x)
print(typeof(x))
Output:
[1]  1.0 -1.0  3.5  2.0
[1] "double"
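Because an atomic vector holds a single type, c() converts mixed inputs to the most general type present. The sketch below illustrates this coercion, followed by element-wise arithmetic and logical indexing on the vector from the slide. The variable names other than `x` are hypothetical.

```r
# Mixed inputs are coerced to one common type.
type_int_dbl <- typeof(c(1L, 2.5))    # integer + double
type_num_chr <- typeof(c(1, "a"))     # number + character
type_lgl_int <- typeof(c(TRUE, 2L))   # logical + integer

print(c(type_int_dbl, type_num_chr, type_lgl_int))

# Arithmetic is applied element by element.
x <- c(1, -1, 3.5, 2)
doubled <- x * 2
print(doubled)

# A logical condition inside [] selects the matching elements.
positive <- x[x > 0]
print(positive)
```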
Lists
 A list is a generic object consisting of an ordered collection of objects.
 Lists are heterogeneous data structures
 The function list()
empId = c(1, 2, 3, 4)
empName = c("Debi", "Sandeep", "Subham", "Shiba")
empList = list(empId, empName)
print(empList)
Output:
[[1]]
[1] 1 2 3 4

[[2]]
[1] "Debi"    "Sandeep" "Subham"  "Shiba"
Accessing Components

 By name (all components of a list can be named)

 empList = list("ID" = empId, "names" = empName)
 print(empList$names)

 By indices
 To access a top-level component, use the double-bracket operator "[[ ]]"; single brackets "[ ]" return a sub-list instead. For lower/inner-level elements, use "[ ]" after "[[ ]]".
 print(empList[1])       # sub-list holding the first component
 print(empList[[1]][2])  # second element of the first component
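The difference between single and double brackets on a list can be sketched as follows, reusing the employee list from the slides. The result variable names are hypothetical.

```r
empId   <- c(1, 2, 3, 4)
empName <- c("Debi", "Sandeep", "Subham", "Shiba")
empList <- list(ID = empId, names = empName)

# Single brackets keep the list wrapper: the result is a list of length 1.
sub_list <- empList[1]

# Double brackets extract the component itself: here a numeric vector.
ids <- empList[[1]]

# Double brackets followed by single brackets reach an inner element.
second_id <- empList[[1]][2]

# Access by name with $ works the same way as [[ ]].
second_name <- empList$names[2]

print(class(sub_list))
print(class(ids))
print(second_id)
print(second_name)
```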
Manipulating Lists

 A list can be modified by accessing its components and replacing them

 empList[[2]][5] = "manimaran"
 print(empList)

 Concatenation of lists:
 li = c(list1, list2)
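A minimal sketch of these list manipulations is shown below. The components `dept` and `city` and their values are hypothetical additions used only to illustrate adding, removing and concatenating.

```r
empList <- list(ID = c(1, 2, 3, 4),
                names = c("Debi", "Sandeep", "Subham", "Shiba"))

# Replace or extend an element inside a component.
empList[[2]][5] <- "manimaran"

# Add a new component by assigning to a new name (hypothetical component).
empList$dept <- "CSE"

# Remove a component by assigning NULL to it.
empList$dept <- NULL

# Concatenate two lists with c(); 'extra' is a hypothetical second list.
extra <- list(city = "Chennai")
combined <- c(empList, extra)

print(names(combined))
print(combined$names)
```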
Matrices
 A matrix is a rectangular arrangement of numbers in rows and columns
 Function matrix()
 Syntax: matrix(data, nrow, ncol, byrow, dimnames)
Matrices
# Elements are arranged sequentially by row.
M <- matrix(c(3:14), nrow = 4, byrow = TRUE)
print(M)
# Elements are arranged sequentially by column.
N <- matrix(c(3:14), nrow = 4, byrow = FALSE)
print(N)
# Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(rownames, colnames))
print(P)
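The effect of byrow can be seen by reading the same position from both matrices. This sketch rebuilds M and N from the slide and shows basic indexing with [row, column]; the result variable names are hypothetical.

```r
M <- matrix(3:14, nrow = 4, byrow = TRUE)   # filled row by row
N <- matrix(3:14, nrow = 4, byrow = FALSE)  # filled column by column

# dim() returns the number of rows and columns.
dims <- dim(M)

# The same position holds different values in the two layouts.
m_23 <- M[2, 3]
n_23 <- N[2, 3]

# Leaving one index empty selects a whole row or a whole column.
first_row <- M[1, ]
first_col <- M[, 1]

print(dims)
print(c(m_23, n_23))
print(first_row)
print(first_col)
```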
DATA FRAME

 Create
 Access rows and columns
 Edit
 Add new rows and columns
Data frames
 Data frames are generic data objects in R that are used to store tabular data
 Function data.frame()
Age = c(22, 25, 45)
Language = c("R", "Python", "Java")
Name = c("Amiya", "Raj", "Asish")
df = data.frame(Name, Language, Age)
print(df)
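The four data frame operations listed in this lecture (create, access, edit, add) can be sketched on the same example. The `City` column, its values and the extra row are hypothetical data added only for illustration.

```r
# Create
df <- data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = FALSE
)

# Access a column, a row, and the rows matching a condition.
ages       <- df$Age
second_row <- df[2, ]
older      <- df[df$Age > 23, ]

# Edit a single cell.
df$Age[1] <- 23

# Add a new column (hypothetical values).
df$City <- c("Chennai", "Delhi", "Mumbai")

# Add a new row with rbind(); the new row must have the same columns.
new_row <- data.frame(Name = "Meena", Language = "C", Age = 30,
                      City = "Pune", stringsAsFactors = FALSE)
df <- rbind(df, new_row)

print(df)
```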
Arrays

 Arrays are R data objects that can store data in more than two dimensions. Arrays are n-dimensional data structures.
 Function array()

A = array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2))
print(A)
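An array element is addressed with one index per dimension. The sketch below rebuilds the 2 x 2 x 2 array from the slide and reads a single element, a whole layer, and a per-layer summary; the result variable names are hypothetical.

```r
A <- array(1:8, dim = c(2, 2, 2))

# Index order is [row, column, layer].
element <- A[1, 2, 2]

# Leaving the first two indices empty selects a whole 2 x 2 layer.
first_layer <- A[, , 1]

# apply() over the third dimension computes one sum per layer.
layer_sums <- apply(A, 3, sum)

print(element)
print(first_layer)
print(layer_sums)
```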
Factors

 Factors in the R programming language are data structures used to categorize data, that is, to represent categorical data and store it as a fixed set of levels.
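A minimal sketch of a factor, assuming a hypothetical `gender` variable: the distinct values become the levels, and each element is stored internally as an integer code pointing to its level.

```r
# A character vector with repeated categories (hypothetical data).
gender <- factor(c("male", "female", "male", "female", "male"))

# The levels are the distinct categories, sorted alphabetically by default.
lv <- levels(gender)

# Internally each element is an integer code into the levels.
codes <- as.integer(gender)

# table() counts how often each level occurs.
counts <- table(gender)

print(lv)
print(codes)
print(counts)
```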
FUNCTION
 A function is a block of code that runs only when it is called.
 It has some inputs, called arguments, and an output, called the return value.
Creating a Function in R
 A function is created by using the keyword function()

TYPES OF FUNCTION IN R LANGUAGE

Built-in Function
 Built-in functions in R are pre-defined functions
# Find sum of numbers 4 to 6.
print(sum(4:6))
# Find max of numbers 4 to 6.
print(max(4:6))
# Find min of numbers 4 to 6.
print(min(4:6))

User-defined Function
 The R language allows us to write our own functions
evenOdd = function(x)
{
if(x %% 2 == 0)
return("even")
else
return("odd")
}
print(evenOdd(4))
print(evenOdd(3))
Example 1
mean2 <- function(x)
{
n <- length(x)
sum(x)/n
}
mean2(1:10)

Example 2
square = function(x)
{
x^2
}
square(4)
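Arguments can also be given default values and passed by name. The function `power` below is a hypothetical example, not part of the lecture; it returns the value of its last expression, as in the examples above.

```r
# 'power' is a hypothetical function; 'exponent' defaults to 2.
power <- function(x, exponent = 2) {
  x^exponent
}

# The default is used when the argument is omitted.
squared <- power(3)

# Arguments can be matched by position ...
cubed <- power(2, 3)

# ... or by name, in any order.
named <- power(exponent = 2, x = 5)

print(c(squared, cubed, named))
```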
Recursive Function in R

 Recursion is when a function calls itself.

 Each call invokes the function again on a smaller input until a base case is reached, which stops the chain of calls. This technique is known as recursion.

rec_fac <- function(x){


if(x==0 || x==1)
{
return(1)
}
else
{
return(x*rec_fac(x-1))
}
}
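The recursive factorial can be checked against R's built-in factorial(). The sketch below repeats the definition so that it runs on its own; the result variable names are hypothetical.

```r
rec_fac <- function(x) {
  # Base case: 0! and 1! are both 1, which stops the recursion.
  if (x == 0 || x == 1) {
    return(1)
  }
  # Recursive case: x! = x * (x - 1)!
  x * rec_fac(x - 1)
}

five_factorial <- rec_fac(5)

# Compare with the built-in factorial() for several inputs.
recursive_values <- sapply(0:5, rec_fac)
builtin_values   <- factorial(0:5)

print(five_factorial)
print(recursive_values)
```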
Find the sum of squares of a given series of numbers.
$\text{Sum} = 1^2 + 2^2 + \dots + N^2$
sum_series <- function(vec){
if(length(vec)<=1)
{
return(vec^2)
}
else
{
return(vec[1]^2+sum_series(vec[-1]))
}
}
series <- c(1:10)
sum_series(series)
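The recursive result can be cross-checked in two ways: with a vectorised sum, and with the standard closed form $N(N+1)(2N+1)/6$ for the sum of the first N squares. The sketch repeats the function so that it runs on its own; the result variable names are hypothetical.

```r
sum_series <- function(vec) {
  # Base case: a single remaining element contributes its square.
  if (length(vec) <= 1) {
    return(vec^2)
  }
  # Recursive case: square the first element and recurse on the rest.
  vec[1]^2 + sum_series(vec[-1])
}

N <- 10
recursive_result  <- sum_series(1:N)
vectorised_result <- sum((1:N)^2)
closed_form       <- N * (N + 1) * (2 * N + 1) / 6

print(c(recursive_result, vectorised_result, closed_form))
```

For N = 10 the closed form gives 10 * 11 * 21 / 6 = 385, which matches both computed values.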
R – OBJECT ORIENTED PROGRAMMING
Class and Object

 A class is the blueprint or prototype from which objects are made by encapsulating data members and functions.
 An object is an instance of a class: a data structure whose attributes are acted upon by the methods of its class.
 R has three class systems:
• S3 class
• S4 class
• Reference class
S3 CLASS

 An S3 object is created from a list that contains all the class members
 The class name is then assigned to this list with the class() function
 Syntax:
 variable_name <- list(attribute1, attribute2, attribute3, …, attributeN)
 # List creation with its attributes name and roll no.
 a <- list(name="Adam", Roll_No=15)
 # Defining a class "Student"
 class(a) <- "Student"
 # Creation of object
 a
S3 CLASS

a = list(name="manimaran", Roll_No=101)
class(a) <- "Student"
# print() dispatches to print.Student for objects of class "Student"
print.Student <- function(obj)
{
cat("name: ", obj$name, "\n")
cat("Roll No: ", obj$Roll_No, "\n")
}
print(a)
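A common way to organise S3 code is a constructor function plus methods for generic functions. The sketch below is one possible arrangement, not the lecture's own code: `new_student` and the generic `describe` are hypothetical names introduced for illustration.

```r
# Hypothetical constructor: builds the list and sets its class.
new_student <- function(name, roll_no) {
  obj <- list(name = name, Roll_No = roll_no)
  class(obj) <- "Student"
  obj
}

# Method for the built-in generic print().
print.Student <- function(x, ...) {
  cat("name:", x$name, "\n")
  cat("Roll No:", x$Roll_No, "\n")
  invisible(x)
}

# A hypothetical generic of our own; UseMethod() dispatches on the class.
describe <- function(x, ...) UseMethod("describe")
describe.Student <- function(x, ...) {
  paste(x$name, "has roll number", x$Roll_No)
}

s <- new_student("Adam", 15)
print(s)
description <- describe(s)
print(description)
```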
S4 CLASS
 An S4 class has a formal, predefined definition created with setClass(). S4 also provides functions for defining generics and methods.

 Syntax:
setClass("myclass", slots = list(name = "character", Roll_No = "numeric"))
 The new() function is used to create an object of the S4 class
 Pass the class name as well as the values for the slots.
S4 CLASS

setClass("Student",slots=list(name="character",
Roll_No="numeric"))
a <- new("Student", name="Adam", Roll_No=20)
a
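Because an S4 class declares the type of each slot, R checks the types when an object is created. The sketch below shows slot access with @, a rejected object, and a custom show() method; the variable `bad` and the "rejected" marker are hypothetical.

```r
library(methods)  # S4 support; usually attached by default

setClass("Student", slots = list(name = "character", Roll_No = "numeric"))

a <- new("Student", name = "Adam", Roll_No = 20)

# Slots are accessed with @ rather than $.
student_name <- a@name
student_roll <- a@Roll_No

# A value of the wrong type for a slot is rejected with an error.
bad <- tryCatch(
  new("Student", name = "Adam", Roll_No = "twenty"),
  error = function(e) "rejected"
)

# show() controls how an S4 object is displayed when it is printed.
setMethod("show", "Student", function(object) {
  cat("name:", object@name, "\n")
  cat("Roll No:", object@Roll_No, "\n")
})

a
print(bad)
```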
R Programming Structure

Loop statements

for loop
Syntax:

for(value in vector)
{
statements .... ....
}
Example:
for (i in 1:4)
{
print(i ^ 2)
}
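A for loop often stores its results instead of printing them. The sketch below, with hypothetical variable names, fills a pre-allocated vector and compares the result with the equivalent vectorised expression.

```r
values <- c(1, 2, 3, 4)

# Pre-allocate the result vector, then fill it inside the loop.
squares <- numeric(length(values))
for (i in seq_along(values)) {
  squares[i] <- values[i]^2
}

# The same result without a loop, using element-wise arithmetic.
squares_vectorised <- values^2

print(squares)
print(squares_vectorised)
```

seq_along(values) is used instead of 1:length(values) so that the loop body is skipped when the vector is empty.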
 repeat loop: used to iterate over a block of code multiple times.
 It executes the same code again and again until a break statement is reached.
 Syntax:

repeat
{
commands
if(condition)
{
break
}
}
Example

result <- c("Hello World")
i <- 1
repeat {
print(result)
i <- i + 1
if(i > 5) {
break
}
}

Output:
[1] "Hello World"
[1] "Hello World"
[1] "Hello World"
[1] "Hello World"
[1] "Hello World"
 while loop
Syntax:

while (test_expression)
{
statement
update_expression
}
# R program to illustrate while loop
result <- c("Hello World")
i <- 1
while (i < 6) {
print(result)
i=i+1
}
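The repeat and while examples in this lecture perform the same five iterations. The sketch below counts the iterations of both forms instead of printing; the counter names are hypothetical.

```r
# repeat: the exit test sits inside the body, so the body runs at least once.
count_repeat <- 0
i <- 1
repeat {
  count_repeat <- count_repeat + 1
  i <- i + 1
  if (i > 5) {
    break
  }
}

# while: the test is checked before each iteration, so the body may run zero times.
count_while <- 0
i <- 1
while (i < 6) {
  count_while <- count_while + 1
  i <- i + 1
}

print(c(count_repeat, count_while))
```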

Next Statement

 It discontinues a particular iteration and jumps to the next iteration

for (i in c(3, 6, 23, 19, 0, 21))
{
if (i == 0)
{
next
}
print(i)
}
print('Outside Loop')

Output:
[1] 3
[1] 6
[1] 23
[1] 19
[1] 21
[1] "Outside Loop"
Break statement

 The break keyword is a jump statement that is used to terminate the loop at a particular iteration.

 Syntax:
if (test_expression)
{
break
}

Example
no <- 1:10
for (val in no)
{
if (val == 5)
{
print(paste("Coming out from for loop Where i = ", val))
break
}
print(paste("Values are: ", val))
}
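next and break can be combined in one loop. In this sketch, with the hypothetical variable `kept`, even numbers are skipped with next and the loop stops with break once a value above 7 is reached.

```r
kept <- c()
for (val in 1:10) {
  if (val %% 2 == 0) {
    next    # skip even numbers and continue with the next value
  }
  if (val > 7) {
    break   # leave the loop entirely
  }
  kept <- c(kept, val)
}

print(kept)
```

The loop ends at val = 9, the first odd value above 7; the value 10 is never visited.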
Decision Making in R Programming
if statement
if(condition is true)
{
execute this statement
}
a <- 76
b <- 67
if(a > b)
{
c <- a - b
print("condition a > b is TRUE")
print(paste("Difference between a, b is : ", c))
}
if-else statement
Syntax:
if(condition is true)
{
execute this statement
} else
{
execute this statement
}

Example
a <- 67
b <- 76
if(a > b)
{
c <- a - b
print("condition a > b is TRUE")
print(paste("Difference between a, b is : ", c))
} else
{
c <- a - b
print("condition a > b is FALSE")
print(paste("Difference between a, b is : ", c))
}
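More than two cases can be handled by chaining else if. The function `compare` below is a hypothetical example; note that else must follow the closing brace on the same line. For whole vectors, the built-in ifelse() applies a condition element by element.

```r
# 'compare' is a hypothetical function illustrating an else-if chain.
compare <- function(a, b) {
  if (a > b) {
    "a is greater than b"
  } else if (a < b) {
    "a is less than b"
  } else {
    "a is equal to b"
  }
}

r1 <- compare(76, 67)
r2 <- compare(67, 76)
r3 <- compare(5, 5)
print(c(r1, r2, r3))

# ifelse() is the vectorised form: one result per element.
labels <- ifelse(c(76, 67) > 70, "high", "low")
print(labels)
```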
THANKS
