
Muthulakshmi M: Software Technical Trainer IBM


Muthulakshmi M

Software Technical Trainer


IBM
R Language
• R is an interpreted programming language created by Ross Ihaka and Robert Gentleman at the University of Auckland.
• It is also a software environment for statistical analysis, graphical representation, reporting, and data modeling.
• R is an implementation of the S programming language, combined with lexical scoping semantics.
Developed by?

❖ The R language was developed by two statisticians, Ross Ihaka and Robert Gentleman.
Users of R
❖ The Consumer Financial Protection Bureau uses R for data analysis.
❖ Bank of America uses R for reporting.
❖ R is part of the technology stack behind Foursquare's famed recommendation engine.
❖ ANZ, the fourth largest bank in Australia, uses R for credit risk analysis.
❖ Google uses R to predict economic activity.
❖ Mozilla, the foundation responsible for the Firefox web browser, uses R to visualize web activity.
History of R

•The language takes its name from the first letter of both developers' first names. The project was conceived in 1992.
•The initial version was released in 1995, and a stable version (1.0.0) was released in 2000.
Features of R
❖ R supports object-oriented as well as procedural programming.
❖ It provides an environment for statistical computation and software development.
❖ It provides an extensive collection of packages and libraries.
❖ R has an active community where people can share and learn from experts.
❖ It can connect to numerous data sources.
❖ This course covers R topics such as introduction, features, installation, the RStudio IDE, variables, data types, operators, if statements, vectors, data handling, graphics, and statistical modelling.
Features of R

❖ R is free, open-source software for statistics and visualization. It can handle structured as well as semi-structured data.
Major Use of R

❖ Statistical inference
❖ Data analysis
❖ Machine learning algorithms
❖ Data visualization
IDEs for R Language

❖ RStudio
❖ Visual Studio for R
❖ Eclipse for R
R Language - Statistical Algorithm

❖ R is statistical software in which complex statistical models such as linear regression, logistic regression, hypothesis testing, ANOVA (Analysis of Variance), and GLM (Generalized Linear Model) can be run.
R Language - Visualization

❖ R has strong tools for data visualization: graphs, bar charts, multi-panel lattice charts, scatter plots, and custom-designed graphics.
R Language - ML Algorithm
⮚ ML algorithms such as SVM, Naive Bayes, XGBoost, decision trees, and random forests are readily available in R, typically through add-on packages.
⮚ These algorithms have proven themselves over time and can give good accuracy.
print() in R

⮚ print("Data Visualization")
⮚ print("Enter the value")
Variables

⮚ Variables are names given to memory locations that are used to store values.
⮚ Each value is stored in a specific location.
Variables
⮚ Previously, we wrote all our values directly inside print(), which leaves no way to refer to them again for further operations.
⮚ Variables solve this problem. As in any other programming language, they are names given to reserved memory locations that can store any type of data.
Variables

In R, the assignment can be denoted in three ways:


⮚ = (Simple Assignment)
⮚ <- (Leftward Assignment)
⮚ -> (Rightward Assignment)
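The three assignment forms can be tried side by side. The following is a small sketch; the variable names are illustrative only.

```r
# Three ways to bind a value to a name in R
a = 10        # simple assignment
b <- 20       # leftward assignment (the form most R style guides prefer)
30 -> c_val   # rightward assignment

total <- a + b + c_val
print(total)  # 60
```

All three forms create an ordinary variable; they differ only in the direction in which the statement is written.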
Keywords in R

⮚ Keywords are words reserved by the language because they have a special meaning. A keyword therefore can't be used as a variable name, function name, etc.
⮚ We can view these keywords by using
 help(reserved)
Taking input from the user
SYNTAX:
var <- readline()
var <- as.integer(var)

EXAMPLE:
a <- readline()        # reads one line from the console as a character string
a <- as.integer(a)     # converts the text to an integer
print(a)

EXAMPLE:
var <- readline(prompt = "Enter any number : ")
var <- as.integer(var)
print(var)
Taking multiple input in R

⮚ Taking multiple inputs in R works the same way as taking a single input: we simply call readline() once for each input.
⮚ The readline() calls can be grouped inside braces so that they run together as one block.
Taking multiple input in R
# taking multiple inputs using braces
{
    var1 <- readline("Enter 1st number : ")
    var2 <- readline("Enter 2nd number : ")
    var3 <- readline("Enter 3rd number : ")
    var4 <- readline("Enter 4th number : ")
}
# converting each value from text to integer
var1 <- as.integer(var1)
var2 <- as.integer(var2)
var3 <- as.integer(var3)
var4 <- as.integer(var4)
# print the sum of the 4 numbers
print(var1 + var2 + var3 + var4)
Taking multiple input in R
c = readline(prompt = "Enter the string: ")
d = readline(prompt = "Enter the character: ")
d = as.character(d)   # readline() already returns a character string, so this is optional
print(c)
print(d)
Data Types
Data types are used to store information.
In R, we do not need to declare a variable as some data type.
Variables are assigned R objects, and the data type of the R object becomes the data type of the variable.
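A short sketch of this behaviour: the same variable name takes on whatever type the assigned object has. The name x and the values are illustrative.

```r
# The type of a variable follows the object assigned to it
x <- 42L          # integer
print(class(x))   # "integer"

x <- 3.14         # the same name now holds a double
print(class(x))   # "numeric"

x <- "R"          # and now a character string
print(class(x))   # "character"

x <- TRUE
print(class(x))   # "logical"

x <- 2 + 3i
print(class(x))   # "complex"
```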
Data Structures in R
⮚A data structure is a particular way of organizing
data in a computer so that it can be used effectively.
⮚The idea is to reduce the space and time
complexities of different tasks. Data structures in R
programming are tools for holding multiple values.
Data Structures in R

⮚ R's base data structures are often organized by their dimensionality (1D, 2D, or nD) and by whether they are homogeneous (all elements must be of the same type) or heterogeneous (the elements can be of different types).
⮚ This gives rise to the five data structures most frequently used in data analysis.
⮚ The following table gives a clear-cut view of these data structures.

Dimension   Homogeneous      Heterogeneous
1D          Atomic vector    List
2D          Matrix           Data frame
nD          Array
Objects in R
There are mainly six data structures in R:
Vector: an ordered combination of values of one type
List: can contain many different types of objects
Array: multi-dimensional
Matrix: two-dimensional
Data frame: tabular data object whose columns can have different types
Factor: stores a vector together with its distinct values (levels)
Vector
⮚ A vector is an ordered collection of values of a basic data type, with a given length.
⮚ The key point is that all elements of a vector must be of the same data type, i.e. vectors are homogeneous data structures. Vectors are one-dimensional data structures.
Vector
 
A vector is a one-dimensional sequence of data elements of the same basic type.
A vector is like a list, but all of its elements have the same type.
•Numeric vector (1, 808, 6527, 742, 268)
•Integer vector (positive and negative whole numbers)
•Character vector ("a", "efjvfVF", "fbyvkd")
•Logical vector (TRUE/FALSE)
•Complex vector (complex numbers of the form a+bi)
Creating Vector

Syntax: variable <- c(values)
Example:
# vectors
x = c(2, 4, 6, 8, 10)
print(x)
Creating Vector
Syntax: variable <- c(values)
The c() function is used to create a vector.
R <- c(1:5)
Language <- c("c", "c++", "java", "python", "r")
Number <- c(1, 2, 3, 4, 5)
Elements can also be named. A name that is not a valid R identifier, such as c++, must be quoted:
Number <- c("c" = 1, "c++" = 2, "java" = 3, "python" = 4, "r" = 5)
The memory a vector occupies depends on its type and length; object.size() reports it.
A vector can also be created in another way.
The scan() function creates a vector at run time.
example <- scan()
scan() lets us type the values in the console; an empty line ends the input.
Vectors are Homogeneous

A vector holds elements of a single data type only.

Program <- c(1, "c++", 3, "python", "r")
When values of different types are mixed, R coerces them to a common type; here every element becomes character.
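The coercion of the mixed vector can be checked directly. This is a minimal sketch; mixed_num is an illustrative name added to show a second case.

```r
program <- c(1, "c++", 3, "python", "r")
print(class(program))    # "character"
print(program)           # the numbers are now the strings "1" and "3"

# Coercion follows the order logical < integer < numeric < complex < character
mixed_num <- c(TRUE, 2L, 3.5)
print(class(mixed_num))  # "numeric"
print(mixed_num)         # 1.0 2.0 3.5
```

Because of this rule, arithmetic on program would fail: its elements are text, not numbers.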
Vectors Function
class() - returns the type of the data.
head() - returns the first 6 elements by default.
tail() - returns the last 6 elements by default.
length() - returns the number of elements in the vector.
mean() - average value.
sum() - total value.
median() - middle value.
scan() - another way to create a vector, by reading values from the console.
max(), min(), etc.
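The functions listed above can be tried on one small vector. This is a sketch with made-up values; marks is an illustrative name.

```r
marks <- c(12, 45, 7, 30, 18, 25, 9, 40)   # illustrative values

class(marks)    # "numeric"
length(marks)   # 8
head(marks)     # first six: 12 45 7 30 18 25
tail(marks)     # last six: 7 30 18 25 9 40
head(marks, 3)  # 12 45 7
sum(marks)      # 186
mean(marks)     # 23.25
median(marks)   # 21.5
max(marks)      # 45
min(marks)      # 7
```

Note that R function names are case-sensitive, so they are written in lower case: class(), not Class().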
Array
⮚ Arrays are R data objects that can store data in more than two dimensions.
⮚ They are used to store an ordered collection of values of the same type (homogeneous).
⮚ Arrays are n-dimensional data structures. For example, if we create an array of dimensions (2, 3, 3), it creates 3 rectangular matrices, each with 2 rows and 3 columns.
Array
•Create
•Access
•Modify
All of these operations can be performed on an array.
Syntax:
array_name <- array(data, dim = c(row_size, column_size, matrices), dimnames = dim_names)
Array
data
The first argument of the array() function. It is the input vector that is given to the array.
matrices
The number of matrices. In R, an array consists of multi-dimensional matrices.
row_size
The number of rows in each matrix of the array.
column_size
The number of columns in each matrix of the array.
dim_names
A list used to change the default names of the rows and columns.
Array
#Creating two vectors of different lengths  
vec1 <-c(1,3,5)  
vec2 <-c(10,11,12,13,14,15)  
  
#Taking these vectors as input to the array   
res <- array(c(vec1,vec2),dim=c(3,3,2))  
print(res)  
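Accessing and modifying follow the same [row, column, matrix] indexing. The sketch below rebuilds the array above with hypothetical dimension names (r1, c1, m1 and so on are made up for illustration).

```r
vec1 <- c(1, 3, 5)
vec2 <- c(10, 11, 12, 13, 14, 15)

# Hypothetical labels for rows, columns and matrices
res <- array(c(vec1, vec2), dim = c(3, 3, 2),
             dimnames = list(c("r1", "r2", "r3"),
                             c("c1", "c2", "c3"),
                             c("m1", "m2")))

print(res[2, 3, 1])        # row 2, column 3 of the first matrix: 14
print(res["r1", , "m2"])   # first row of the second matrix: 1 10 13

res[2, 3, 1] <- 99         # modify a single element
print(res[, , "m1"])       # first matrix after the change
```

The nine input values fill the first matrix column by column and are then recycled to fill the second matrix.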
Matrix

⮚ A matrix is a rectangular arrangement of numbers in rows and columns.
⮚ In a matrix, rows run horizontally and columns run vertically.
⮚ Matrices are two-dimensional, homogeneous data structures.
Matrix
⮚ To create a matrix in R, use the matrix() function.
⮚ The main argument to matrix() is the vector of elements.
⮚ The element type is homogeneous.
⮚ You also pass the number of rows and the number of columns you want. The important point to remember is that, by default, a matrix is filled column by column.
Matrix
As with vectors and lists, R provides a function for creating a matrix: the matrix() function.
matrix(data, nrow, ncol, byrow, dimnames)
data
The first argument of the matrix function. It is the input vector that holds the data elements of the matrix.
nrow
The second argument is the number of rows we want in the matrix.
ncol
The third argument is the number of columns we want in the matrix.
byrow
The byrow parameter is a logical flag. If its value is TRUE, the input vector elements are arranged by row.
dimnames
The dimnames parameter is a list of the names assigned to the rows and columns.
EXAMPLE:
a = matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9),
           nrow = 3,
           ncol = 3,
           byrow = TRUE)
print(a)
print(a[1, 3])   # 3
EXAMPLE:
a = matrix(c(3:8), nrow = 2, byrow = TRUE)   # rows: 3 4 5 / 6 7 8
b = matrix(c(1:6), nrow = 2, byrow = TRUE)   # rows: 1 2 3 / 4 5 6
print(a + b)
o/p:
 4  6  8
10 12 14
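The difference between the default column-wise fill and byrow = TRUE is easiest to see with the same data filled both ways. This is a small sketch; the row and column labels are illustrative.

```r
by_col <- matrix(1:6, nrow = 2)                 # default: filled column by column
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)   # filled row by row

print(by_col)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

print(by_row)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

# dimnames takes a list: row names first, then column names
named <- matrix(1:6, nrow = 2, byrow = TRUE,
                dimnames = list(c("row1", "row2"), c("A", "B", "C")))
print(named["row2", "B"])   # 5
print(dim(named))           # 2 3
print(t(named))             # transpose: 3 rows, 2 columns
```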
Factors
⮚ Factors are data objects used to categorize data and store it as levels.
⮚ They are useful for storing categorical data. They can store both strings and integers.
⮚ They are useful for categorizing the unique values in a column, such as "TRUE" or "FALSE", or "MALE" or "FEMALE". They are useful in data analysis for statistical modeling.
Factors

⮚ A factor is a data structure used for fields that take only a predefined, finite number of values.
⮚ In other words, factors represent variables that take a limited number of different values.
Factors

EXAMPLE:
fac = factor(c("Male", "Female", "Male",
               "Male", "Female", "Male", "Female"))
print(fac)
Factors
EXAMPLE:
data <- c("Nishka", "Gunjan", "Shubham", "Arpita", "Arpita", "Sumit", "Gunjan", "Shubham")
# Creating the factors
factor_data <- factor(data)
print(factor_data)

# Apply the factor function with the required order of the levels.
new_order_factor <- factor(factor_data, levels = c("Gunjan", "Nishka", "Arpita", "Shubham", "Sumit"))
print(new_order_factor)
Factors

EXAMPLE:
gen_factor<- gl(3,5,labels=c("BCA","MCA","B.Tech"))  
gen_factor
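A factor can be inspected with a few base functions. The sketch below reuses the Male/Female example and the gl() call from the slides; only the inspection calls are added.

```r
fac <- factor(c("Male", "Female", "Male", "Male", "Female", "Male", "Female"))

print(levels(fac))      # "Female" "Male"  (sorted alphabetically by default)
print(nlevels(fac))     # 2
print(table(fac))       # counts per level: Female 3, Male 4
print(as.integer(fac))  # underlying codes: 2 1 2 2 1 2 1

# gl(n, k) generates n levels, each repeated k times in a block
gen_factor <- gl(3, 5, labels = c("BCA", "MCA", "B.Tech"))
print(length(gen_factor))  # 15
print(table(gen_factor))   # 5 of each label
```

Internally a factor stores integer codes plus the table of levels, which is why as.integer() returns numbers rather than the labels.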
 
Data Frame

A data frame is a two-dimensional, array-like structure or table in which each column contains the values of one variable and each row contains one set of values, one from each column.
A data frame is a special case of a list in which every component has equal length.
A data frame is used to store a data table; the vectors it holds as list components must all be of equal length.
Data frame
# Creating the data frame.  
emp.data<- data.frame(  
employee_id = c (1:5),   
employee_name = c("Shubham","Arpita","Nishka","Gunjan","Sumit"),  
sal = c(623.3,915.2,611.0,729.0,843.25),   
  
starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",  
      "2015-03-27")),  
stringsAsFactors = FALSE  
)  
# Printing the data frame.            
print(emp.data)  
Data frame
# Creating the data frame.
a <- c(2, 3)
b <- c(20, 30)
c <- c(22, 33)
df <- data.frame(a, b, c)
df

o/p:

  a  b  c
1 2 20 22
2 3 30 33
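Once a data frame exists, columns and rows can be selected and new columns added. This is a minimal sketch with made-up rows; emp and its column names are illustrative.

```r
emp <- data.frame(
  id   = 1:3,
  name = c("Asha", "Ravi", "Meena"),   # made-up example rows
  sal  = c(500, 750, 620),
  stringsAsFactors = FALSE
)

str(emp)                        # structure: 3 obs. of 3 variables
print(emp$name)                 # one column as a vector
print(emp[2, ])                 # second row
print(emp[emp$sal > 600, ])     # rows where sal exceeds 600

emp$bonus <- emp$sal * 0.10     # add a derived column
print(ncol(emp))                # 4
print(mean(emp$sal))
```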
Extracting
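Program <- c(1:4)
Program[5]                              # NA: index beyond the length
Program[c(1, 5, 9)]                     # 1 NA NA
Program[-1]                             # all elements except the first
Program[-5]                             # unchanged: there is no fifth element
Program[-length(Program)]               # all elements except the last
Program[seq(2, length(Program), 2)]     # elements at even positions
Program[seq(3, length(Program), 3)]     # elements at every third position
sum(Program < 10)                       # number of elements below 10
Program < 10                            # logical vector
sum(Program[Program < 15])              # sum of the elements below 15
sum(Program[Program %% 15 == 0])        # sum of the multiples of 15
cor(x, y)                               # correlation of two numeric vectors x and y
cumsum(Program)                         # cumulative sum
cumprod(Program)                        # cumulative product
fivenum(Program)                        # Tukey's five-number summary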
Sorting
sort(Program)
rev(sort(Program))
sum(rev(sort(Program))[1:3])
which(Program==max(Program))
Program[which(Program==max(Program))]
which(Program==min(Program))
Program[which(Program==min(Program))]
rank(Program)
order(Program)
quantile(Program)
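sort(), order() and rank() are easy to confuse, so a short comparison may help. This is a sketch on illustrative values; scores is a made-up name.

```r
scores <- c(40, 10, 30, 20)   # illustrative values

sort(scores)                      # 10 20 30 40
sort(scores, decreasing = TRUE)   # 40 30 20 10
order(scores)                     # 2 4 3 1  (positions that would sort the vector)
rank(scores)                      # 4 1 3 2  (rank of each element in place)
scores[order(scores)]             # same result as sort(scores)
which.max(scores)                 # 1, position of the largest value
which.min(scores)                 # 2, position of the smallest value
```

which.max() and which.min() return only the first matching position, whereas which(x == max(x)) returns every position that ties for the maximum.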
If condition

⮚ An if statement consists of a Boolean expression followed by one or more statements.
⮚ The if statement is the simplest decision-making statement; it lets us take a decision on the basis of a condition.
⮚ It is a conditional statement that runs its block only when the condition proves true.
⮚ The block of code inside the if statement is executed only when the Boolean expression evaluates to TRUE. If it evaluates to FALSE, the block is skipped and execution continues with the code after the if statement.
If condition
if(boolean_expression) {
   # If the boolean expression is true, then the statement(s) here will be executed
}
Example:
x <-24L  
y <- "shubham"  
if(is.integer(x))  
{  
    print("x is an Integer")  
}  
If condition
x <-20  
y<-24  
count=0  
if(x<y)  
{  
    cat(x,"is a smaller number\n")  
    count=1  
}  
if(count==1){  
    cat("Block is successfully executed")
}  
If condition
x <-1
y<-24
count=0
while(x<y){
    cat(x,"is a smaller number\n")
    x=x+2
    if(x==15)
        break
}
If condition
x <- 5
if (x > 10) {
  print(paste(x, "is greater than 10"))
} else {
  print(paste(x, "is not greater than 10"))
}
Looping

⮚ For loop: executes a statement once for each element of a vector or sequence.
⮚ While loop: executes code as long as the condition is true; the condition is checked before each pass.
⮚ Repeat loop: repeats the statements until a break statement is reached.
For loop
⮚Syntax:
for (variable in vector)
{
statement(s)
}
Example:
for (i in 1:5)
{
print(i^2)
}
For loop
x <- c(1, 2, 3, 4)
for (i in x)
{
print(c(i, i^2))
}
Example 2:
a <- LETTERS[1:10]
for (i in a)
{
print(i)
}
While loop

⮚ A while loop is a control flow statement used to repeat a block of code a number of times.
⮚ The while loop terminates when the Boolean expression evaluates to FALSE.
⮚ In a while loop, the condition is checked first, and only then is the body executed.
⮚ If the body runs n times, the condition is checked n+1 times, because the final check is the one that fails and ends the loop.
While loop

⮚Syntax:
while (test_expression) {  
   statement  
}  
While loop

v <- c("Hello","while loop","example")
cnt <- 2
while (cnt < 7) {
   print(v)
   cnt = cnt + 1
}
Repeat loop

•A repeat loop is used to iterate a block of code. It is a special type of loop that has no condition of its own for exiting. To exit, we include a break statement guarded by a user-defined condition. This property distinguishes it from the other loops.
•A repeat loop is constructed with the repeat keyword. Because it has no built-in exit condition, it is very easy to write an infinite loop this way.
Repeat loop

Syntax:
repeat {   
   commands   
   if(condition) {  
      break  
   }  
}  
Repeat loop
v <- c("Hello","repeat","loop")  
cnt <- 2  
repeat {  
   print(v)  
   cnt <- cnt+1  
     
   if(cnt > 5) {  
      break  
   }  
}  
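To compare the three loop forms, the same task (summing the numbers 1 to n) can be written with each of them. This is a sketch; n and the total_* names are illustrative.

```r
n <- 5

# for: iterate over a known sequence
total_for <- 0
for (i in 1:n) {
  total_for <- total_for + i
}

# while: test the condition before every pass
total_while <- 0
i <- 1
while (i <= n) {
  total_while <- total_while + i
  i <- i + 1
}

# repeat: no condition of its own, so break is required
total_repeat <- 0
i <- 1
repeat {
  total_repeat <- total_repeat + i
  i <- i + 1
  if (i > n) {
    break
  }
}

print(c(total_for, total_while, total_repeat))   # 15 15 15
```

When the number of iterations is known in advance, the for loop is usually the shortest and least error-prone of the three.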
Function in R
•A set of statements organized together to perform a specific task is known as a function. R provides a series of built-in functions, and it allows users to create their own functions. Functions are used to perform tasks in a modular way.
•Functions are used to avoid repeating the same task and to reduce complexity. To understand and maintain our code, we logically break it into smaller parts using functions. A function:
•Is written to carry out a specified task.
•May or may not have arguments.
•Contains a body in which our code is written.
•May or may not return one or more output values.
Function in R
Function Name
The function name is the actual name of the function. In R, the function is stored as an object under this name.
Arguments
In R, an argument is a placeholder. Arguments are optional: a function may or may not take arguments, and arguments can have default values. We pass a value to the argument when the function is invoked.
Function Body
The function body contains a set of statements that define what the function does.
Return value
The return value is the last expression evaluated in the function body.
Function in R

afunc <- function(a)
{
for (i in 1:a)   # i takes the values 1, 2, 3 when a = 3
{
b <- i * 2
print(b)
}
}
afunc(3)
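Default argument values and return values can be sketched with two small functions. Both area_rect and stats are hypothetical names made up for this example.

```r
# area_rect is a hypothetical example function
area_rect <- function(width, height = 1) {
  width * height        # the last evaluated expression is the return value
}

print(area_rect(4))                       # 4, height falls back to its default
print(area_rect(4, 2.5))                  # 10
print(area_rect(height = 3, width = 2))   # 6, arguments matched by name

# Several results can be returned together in a list
stats <- function(v) {
  list(total = sum(v), average = mean(v))
}
res <- stats(c(2, 4, 6))
print(res$total)     # 12
print(res$average)   # 4
```

An explicit return(value) call is also available, but it is only needed when a function should exit before reaching its last expression.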
Class in R

•The S3 class is the most popular class system in the R programming language. Most of the classes that come predefined in R are of this type.
•First we create a list with various components, then we assign it a class using the class() function.
Class in R

# create a list with required components
student1 <- list(name = "John", age = 21, GPA = 3.5)

# name the class appropriately
class(student1) <- "Student_Info"

# create and call an object
student1
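Once an object carries a class attribute, generic functions such as print() can be given a method for it. The sketch below assumes the Student_Info object from the slide; the output format of the method is an illustrative choice.

```r
student1 <- list(name = "John", age = 21, GPA = 3.5)
class(student1) <- "Student_Info"

# Method for the generic print(); R looks it up by the name print.<class>
print.Student_Info <- function(x, ...) {
  cat("Student:", x$name, "| age:", x$age, "| GPA:", x$GPA, "\n")
  invisible(x)
}

print(student1)                             # dispatches to print.Student_Info
print(inherits(student1, "Student_Info"))   # TRUE
print(unclass(student1)$name)               # "John"
```

Typing student1 on its own at the console calls the same method, because auto-printing goes through print().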
