
DataAnalyticsUsingR Dr.P.rajesh

The document outlines a national level webinar on Data Analytics using R programming, emphasizing the importance of data analytics in decision-making for businesses. It details various types of data analytics, including descriptive, diagnostic, predictive, and prescriptive analytics, along with the features and advantages of using R programming for data analysis. Additionally, it covers the applications of R in different industries and provides insights into RStudio as an integrated development environment for R.


One Day National Level Webinar on

Data Analytics
using R
Programming
Dr. P. Rajesh
Assistant Professor
PG Department of Computer Science
Government Arts College
C.Mutlur, Chidambaram.
Email: [email protected]

Arignar Anna Government Arts College, Villupuram. Date: 22.07.2020, Time: 10.00 am to 12.00 pm
Data Analytics
• Data Science and Data Analytics are two of the most popular terms today.
• Data is collected in raw form and processed according to the requirements of a company.
• This data is then used for decision-making.
• This process helps businesses grow in the market.
• But the main question arises: what is this process called?
• Data Analytics is the answer, and Data Analysts and Data Scientists are the people who perform it.
What is Data Analytics?
• Data or information starts out in raw format.
• The growth in the size of data has created a need for inspection, data cleaning, transformation, and data modeling to gain insights from the data and derive conclusions for better decision-making.
• This process is known as data analysis.
• Analysis is an interactive process in which a person tackles a problem, finds the data required to get an answer, analyzes that data, and interprets the results.
https://www.kdnuggets.com/2017/07/4-types-data-analytics.html
1. Descriptive:
What is happening?
• With the help of descriptive analysis, we analyze and describe the features of the data.
• It deals with the summarization of information.
• In descriptive analysis, we work with past data to draw conclusions and present the data in the form of actionable results.
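As a small illustration of summarizing past data, the sketch below computes a few descriptive statistics in R. The sales figures are made up for this example and do not come from the webinar.

```r
# Hypothetical monthly sales figures, invented purely for illustration.
sales <- c(120, 135, 128, 150, 142, 160)

# Describe the past data: central tendency and spread.
avg_sales <- mean(sales)
med_sales <- median(sales)
sd_sales  <- sd(sales)

print(avg_sales)
print(med_sales)
print(sd_sales)

# summary() reports minimum, quartiles, median, mean and maximum in one call.
print(summary(sales))
```

The same calls work on a column of a data frame, for example a salary column read from a CSV file.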
2. Diagnostic:
Why is it happening?
• Diagnostic analytics is a form of advanced analytics that examines data or content to answer the question "Why did it happen?"
• It is characterized by techniques such as data discovery, data mining and correlations.
3. Predictive: What is likely to happen?
• With the help of predictive analysis, we determine the likely future outcome.
• Based on the analysis of historical data, we are able to forecast the future.
• With the help of data analytics, technological advancements and machine learning, we are able to obtain predictive insights about the future effectively.
• Predictive analytics is a complex field that requires a large amount of data, skilled implementation of predictive models, and careful tuning of those models to obtain accurate predictions.
4. Prescriptive:
What do I need to do?
• Prescriptive analysis builds on an understanding of what has happened and why it has happened, together with a variety of "what-might-happen" analyses.
• It helps the user determine the best course of action to take.
• Prescriptive analysis typically involves not just one individual action but a host of other actions.
• Example: finding the best route home by considering the distance of each route and the speed.
What are the types of Data in Data Analytics?
History
• R is a programming language and software environment for statistical analysis, graphics representation and reporting.
• R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand.
• R is freely available under the GNU General Public License.
• R is available for various operating systems such as Linux, Windows and Mac.
• The language was named R after the first letter of the first names of its two authors (Robert Gentleman and Ross Ihaka).
• The name is also a play on the name of the Bell Labs language S.
• R was initially written at the Department of Statistics of the University of Auckland.
• R made its first appearance in 1993.
• Since mid-1997 there has been a core group (the "R Core Team") who can modify the R source code archive.
Why Learn R Programming Language
• With R, you can perform statistical analysis, data analysis and machine learning.
• We can create objects, functions and packages in it.
• R is platform-independent and can be used across multiple operating systems.
• R is free owing to its open-source GNU licensing and can be installed by anyone.
• R offers a robust collection of graphical libraries such as ggplot2, plotly and many more.
• R is widely used in industries such as health, finance, banking, manufacturing and many more.
• There are about 2 million job openings for R programmers worldwide.
• Companies hire R programmers for many roles, such as data analyst and business analyst.
Features of R
• As stated earlier, R is a programming language and software environment for statistical analysis, graphics representation and reporting. The following are the important features of R:
• R is a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
• R has an effective data handling and storage facility.
• R provides a suite of operators for calculations on arrays, lists, vectors and matrices.
• R provides a large, coherent and integrated collection of tools for data analysis.
How R is better than Other Technologies
Certain aspects of R programming make it attractive in comparison with other technologies:
• Graphical Libraries – Libraries such as ggplot2 and plotly make it easy to build appealing, well-defined plots.
• Availability / Cost – R is completely free.
• Advancement in Tool – R supports various advanced tools and features that allow you to build robust statistical models.
• Job Scenario – With the immense growth of Data Science and the rise in demand, R has become one of the most in-demand programming languages today.
• Customer Service Support and Community – With R, you can enjoy strong community support.
• Portability – R is highly portable. Many different programming languages and software frameworks can easily be combined with the R environment for the best results.
Sourcing of R Script
RStudio
• RStudio is an Integrated Development Environment for R.
• It supports extensive code editing and development, along with various other features.
Features of RStudio
• RStudio provides various tools and features that allow you to boost your coding productivity.
• It can also be accessed over the web and is cross-platform in nature.
• It facilitates automatic checking of updates.
• It provides support for recovery in case of file loss.
• With RStudio, you can manage your data more efficiently.
Components of RStudio
• Source – In the top left corner of the screen is the text editor that allows you to work with source scripts. You can enter multiple lines in this source pane.
• Console – This is in the bottom left corner of the main RStudio window. It facilitates interactive scripting in R.
• Workspace and History – In the top right corner are the R workspace and the history window. They show the list of all variables and the list of past commands that were executed by R.
Applications of R Programming
• R is used in the finance and banking sectors for detecting fraud, reducing customer churn rate and making future decisions.
• R is used in bioinformatics to analyze strands of genetic sequences, to perform drug discovery, and in computational neuroscience.
• R is used in social media analysis to discover potential customers for online advertising.
• Companies also use social media information to analyze customer sentiment and make their products better.
• E-commerce companies use R to analyze the purchases made by customers as well as their feedback.
• Manufacturing companies use R to analyze customer feedback.
• They also use it to predict future demand so they can adjust their manufacturing speeds and maximize profits.
Companies Using R
Some of the companies that use R programming are as follows:
• Facebook
• Google
• LinkedIn
• IBM
• Twitter
• Uber
• Airbnb
• Ford Motor Company
• Microsoft
R Installation
https://cran.r-project.org/bin/windows/base/
R Console Window
R Command Prompt
Once you have the R environment set up, it is easy to start the R command prompt by typing the command R in your terminal.
This will launch the R interpreter and you will get a prompt > where you can start typing your program as follows.
> myString <- "Hello, World!"
> print(myString)
[1] "Hello, World!"
RStudio (IDE)
https://rstudio.com/products/rstudio/download/#download
R - Data Types
• In contrast to programming languages like C and Java, variables in R are not declared with a data type.
• Variables are assigned R-objects, and the data type of the R-object becomes the data type of the variable.
• There are many types of R-objects. The frequently used ones are:
• Vectors
• Lists
• Matrices
• Arrays
• Factors
• Data Frames
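A minimal sketch of the point above: the same variable can hold different R-objects, and class() reports the type of whatever object it currently holds. The variable names and values are chosen only for illustration.

```r
# A variable takes the type of the object assigned to it.
x <- 5L
print(class(x))      # integer
x <- "five"
print(class(x))      # character

# One example of each frequently used R-object.
v <- c(1.5, 2.5, 3.5)                          # vector
l <- list("a", 2, TRUE)                        # list
m <- matrix(1:6, nrow = 2)                     # matrix
a <- array(1:24, dim = c(2, 3, 4))             # array
f <- factor(c("low", "high", "low"))           # factor
d <- data.frame(id = 1:2, name = c("x", "y"))  # data frame

print(class(v))
print(class(l))
print(class(m))
print(class(a))
print(class(f))
print(class(d))
```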
R - Functions
• A function is a set of statements organized to perform a specific task.
• R has a large number of built-in functions.
• Users can also create their own functions.
Built-in Function
• Simple examples of built-in functions are seq(), mean(), max(), sum() and paste().
• They are called directly by user-written programs.
# Create a sequence of numbers from 32 to 44.
print(seq(32,44))
# Find mean of numbers from 25 to 82.
print(mean(25:82))
# Find sum of numbers from 41 to 68.
print(sum(41:68))
User-defined Function
• They are specific to what a user wants, and once created they can be used like the built-in functions.
# Create a function to print squares of numbers in sequence.
new.function <- function(a) {
  for (i in 1:a) {
    b <- i^2
    print(b)
  }
}
Calling a Function
# Call the function new.function supplying 6 as an argument.
new.function(6)
Produces the following result:
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
[1] 36
R String Manipulation Functions
1. grep()
It is used for pattern matching. It returns the matching elements (value=TRUE) or their positions (value=FALSE); replacement is done with sub() and gsub().
grep("b+", c("abc", "bda", "ccaa", "abd"), perl=TRUE, value=TRUE)
[1] "abc" "bda" "abd"
grep("b+", c("abc", "bda", "ccaa", "abd"), perl=TRUE, value=FALSE)
[1] 1 2 4
grep("chid+", c("chidambaram", "Villupuram", "Srimushnam", "chidambaram"), perl=TRUE, value=FALSE)
[1] 1 4
grep("அ+", c("அப்பா", "தாத்தா", "அம்மா"), perl=TRUE, value=FALSE)
[1] 1 3
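grep() only locates matches. For replacing matched text, base R offers sub() and gsub(). The sketch below reuses place names from the slide; the pattern and replacement are chosen only for illustration.

```r
places <- c("chidambaram", "Villupuram", "Srimushnam")

# sub() replaces only the first match in each element.
first_only <- sub("a", "A", places)
print(first_only)

# gsub() replaces every match in each element.
all_matches <- gsub("a", "A", places)
print(all_matches)

# grepl() returns TRUE/FALSE per element instead of positions.
has_chid <- grepl("chid", places)
print(has_chid)
```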
2. nchar()
With the help of this function, we can count the characters in a string.
> str <- "Big Data at DataFlair"
> nchar(str)
[1] 21
3. paste()
Concatenates any number of strings using the paste() function; by default the pieces are separated by a space.
> #Author DataFlair
> paste("Hadoop", "Spark", "and", "Flink")
[1] "Hadoop Spark and Flink"
4. sprintf()
This function makes use of formatting commands that are styled after C.
> sprintf("%s scored %.2f percent", "Matthew", 72.3)
[1] "Matthew scored 72.30 percent"
5. strsplit()
This function splits a string at the given separator and returns a list.
> #Author DataFlair
> str <- "Splitting sentence into words"
> strsplit(str, " ")
[[1]]
[1] "Splitting" "sentence" "into" "words"
> strsplit(str, "")
[[1]]
[1] "S" "p" "l" "i" "t" "t" "i" "n" "g" " " "s" "e" "n" "t" "e" "n" "c" "e" " "
[20] "i" "n" "t" "o" " " "w" "o" "r" "d" "s"
Vector
Vectors are the most basic R data objects and there are six types of atomic vectors: logical, integer, double, complex, character and raw.
Multiple Elements Vector - Using colon operator with numeric data
# Creating a sequence from 5 to 13.
v <- 5:13
print(v)
[1] 5 6 7 8 9 10 11 12 13
# Creating a sequence from 6.6 to 12.6.
v <- 6.6:12.6
print(v)
[1] 6.6 7.6 8.6 9.6 10.6 11.6 12.6
# If the final element specified does not belong to the sequence then it is discarded.
v <- 3.8:11.4
print(v)
[1] 3.8 4.8 5.8 6.8 7.8 8.8 9.8 10.8
Using the seq() function
# Create vector with elements from 5 to 9 incrementing by 0.4.
print(seq(5, 9, by = 0.4))
[1] 5.0 5.4 5.8 6.2 6.6 7.0 7.4 7.8 8.2 8.6 9.0
Using the c() function
The non-character values are coerced to character type if one of the elements is a character.
# The logical and numeric values are converted to characters.
s <- c('apple','red',5,TRUE)
print(s)
[1] "apple" "red" "5" "TRUE"
Accessing Vector Elements
• The [ ] brackets are used for indexing. Indexing starts with position 1.
• Giving a negative value in the index drops that element from the result.
• Logical values TRUE and FALSE can also be used for indexing. Numeric 0 and 1 are treated as positions, not as logical flags (position 0 selects nothing).
# Accessing vector elements using position.
t <- c("Sun","Mon","Tue","Wed","Thurs","Fri","Sat")
u <- t[c(2,3,6)]
print(u)
[1] "Mon" "Tue" "Fri"
# Accessing vector elements using logical indexing.
v <- t[c(TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,FALSE)]
print(v)
[1] "Sun" "Fri"
# Accessing vector elements using negative indexing.
x <- t[c(-2,-5)]
print(x)
[1] "Sun" "Tue" "Wed" "Fri" "Sat"
R - Lists
• Lists are R objects which contain elements of different types, such as numbers, strings, vectors and even another list.
• A list is created using the list() function.
Creating a List
# Create a list containing strings, numbers, a vector and a logical value.
list_data <- list("Red", "Green", c(21,32,11), TRUE, 51.23, 119.1)
print(list_data)
[[1]]
[1] "Red"
[[2]]
[1] "Green"
[[3]]
[1] 21 32 11
[[4]]
[1] TRUE
[[5]]
[1] 51.23
[[6]]
[1] 119.1
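Once a list exists, its elements can be pulled out by position or by name. The sketch below reuses list_data from the slide; the element names are hypothetical and added only for this example.

```r
list_data <- list("Red", "Green", c(21, 32, 11), TRUE, 51.23, 119.1)

# Hypothetical names, added so elements can be accessed with $.
names(list_data) <- c("first", "second", "numbers", "flag", "x", "y")

# Single brackets return a sub-list; double brackets return the element itself.
sub_list <- list_data[3]
vec <- list_data[[3]]
print(class(sub_list))
print(class(vec))

# Access by name, then index into the vector stored there.
second_num <- list_data$numbers[2]
print(second_num)
```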
R - Matrices
• Matrices are R objects in which the elements are arranged in a two-dimensional rectangular layout.
Syntax
matrix(data, nrow, ncol, byrow, dimnames)
• data is the input vector which becomes the data elements of the matrix.
• nrow is the number of rows to be created.
• ncol is the number of columns to be created.
• byrow is a logical flag. If TRUE, the input vector elements are arranged by row.
• dimnames is the names assigned to the rows and columns.
Matrix Example
# Elements are arranged sequentially by row.
M <- matrix(c(3:14), nrow = 4, byrow = TRUE)
print(M)
# Elements are arranged sequentially by column.
N <- matrix(c(3:14), nrow = 4, byrow = FALSE)
print(N)
# Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(rownames, colnames))
print(P)
Accessing Elements of a Matrix
Elements of a matrix can be accessed by using the row and column index.
# Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
# Create the matrix.
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(rownames, colnames))
# Access the element at 3rd column and 1st row.
print(P[1,3])
[1] 5
# Access the element at 2nd column and 4th row.
print(P[4,2])
[1] 13
# Access only the 2nd row.
print(P[2,])
col1 col2 col3
   6    7    8
# Access only the 3rd column.
print(P[,3])
row1 row2 row3 row4
   5    8   11   14
# Example for Matrix functions
A <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9),3,3)
B <- matrix(c(11, 12, 13, 14, 15, 16, 17, 18, 19),3,3)
C <- matrix(c(21, 22, 23, 24, 25, 26, 27, 28, 29),3,3)
A
# Check whether the variable A is a matrix or not
is.matrix(A)
# Multiplication by a Scalar
s<-3
s1<-A*s
s1
# Matrix Addition
Matadd<-A+B
Matadd
# Matrix Subtraction
Matsub<-A-B
# Matrix Multiplication (A*B multiplies element by element; use A %*% B for the matrix product)
Matmul<-A*B
Matmul
# Matrix Transpose
tm<-t(A)
tm
# Computing Column & Row Sums
sum(A)
colSums(A)
rowSums(A)
# Computing Column & Row Means
mean(A)
colMeans(A)
rowMeans(A)
# Accessing the matrix element
A
A[2,2]
A[2,2]<-33
A[3,2]+B[2,2]
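In R, the * operator multiplies matrices element by element, while %*% computes the matrix product from linear algebra. The sketch below contrasts the two on small 2 x 2 matrices chosen for illustration; they are not the matrices from the slide.

```r
# Small illustrative matrices, filled column by column.
A <- matrix(1:4, nrow = 2)   # rows: (1, 3) and (2, 4)
B <- matrix(5:8, nrow = 2)   # rows: (5, 7) and (6, 8)

# Element-wise product: each entry of A times the matching entry of B.
elementwise <- A * B
print(elementwise)

# Matrix product: rows of A combined with columns of B.
product <- A %*% B
print(product)
```

The two results differ in general, so it is worth checking which operation an analysis actually needs.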
R - Data Frames
• A data frame is a table or two-dimensional structure.
• Each column contains values of one variable.
Following are the characteristics of a data frame:
• The column names should be non-empty.
• The row names should be unique.
• The data stored in a data frame can be of numeric, factor or character type.
• Each column should contain the same number of data items.


# Create the data frame.
emp.data <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
  salary = c(623.3,515.2,611.0,729.0,843.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Print the data frame.
print(emp.data)
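After a data frame is created, str() shows its structure, and columns or rows can be selected by name or by condition. The sketch below rebuilds a shortened version of emp.data (without start_date); the dept values are taken from the input.csv listing later in the slides, and the salary threshold is arbitrary.

```r
emp.data <- data.frame(
  emp_id = 1:5,
  emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
  stringsAsFactors = FALSE
)

# Inspect column names and types.
str(emp.data)

# Extract one column as a vector.
names_col <- emp.data$emp_name
print(names_col)

# Keep rows with salary above an arbitrary threshold, and two columns only.
high_paid <- emp.data[emp.data$salary > 700, c("emp_name", "salary")]
print(high_paid)

# Add a new column; it must have one value per row.
emp.data$dept <- c("IT", "Operations", "IT", "HR", "Finance")
print(emp.data)
```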
Summary of Data in Data Frame
The statistical summary and nature of the data can be obtained by applying the summary() function.
# Create the data frame.
emp.data <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
  salary = c(623.3,515.2,611.0,729.0,843.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Print the data frame.
print(emp.data)
# Print the summary.
print(summary(emp.data))
Working with CSV Files
• R can read data from files stored outside the R environment.
• R can also write data into files which are stored and accessed by the operating system.
• R can read and write various file formats like csv, excel, xml etc.
Getting and Setting the Working Directory
getwd() - Find the working directory
setwd() - Set the working directory
# Get and print current working directory.
print(getwd())
# Set current working directory.
setwd("/web/com")
# Get and print current working directory.
print(getwd())
Input as CSV File
• The csv file is a text file.
• The values in the columns are separated by a comma.
• The following data is present in the file named input.csv.
id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
Reading a CSV File
The read.csv() function reads a CSV file available in your current working directory.
data <- read.csv("input.csv")
print(data)
Analysing the CSV File
The read.csv() function returns its output as a data frame.
data <- read.csv("input.csv")
print(is.data.frame(data))
[1] TRUE
print(ncol(data))
[1] 5
print(nrow(data))
[1] 8
Get the maximum salary
# Create a data frame.
data <- read.csv("input.csv")
# Get the max salary from data frame.
sal <- max(data$salary)
print(sal)
# Get the person detail having max salary.
retval <- subset(data, salary == max(salary))
print(retval)
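The same subset() call can filter on other columns or combine conditions. The sketch below embeds the input.csv contents as text, so it runs without a file on disk; that is a convenience for this example, not how the slides load the data, and the salary threshold is arbitrary.

```r
# The contents of input.csv from the slides, embedded as text.
csv_text <- "id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance"

# read.csv() accepts a text argument in place of a file name.
data <- read.csv(text = csv_text)

# Everyone working in the IT department.
it_staff <- subset(data, dept == "IT")
print(it_staff)

# IT staff whose salary is above 600 (threshold chosen for illustration).
well_paid_it <- subset(data, salary > 600 & dept == "IT")
print(well_paid_it)
```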
Writing into a CSV File
• R can create a csv file from an existing data frame.
• The write.csv() function is used to create the csv file.
• This file gets created in the working directory.
# Create a data frame.
data <- read.csv("input.csv")
retval <- subset(data, as.Date(start_date) > as.Date("2014-01-01"))
# Write filtered data into a new file.
write.csv(retval,"output.csv")
newdata <- read.csv("output.csv")
print(newdata)
Linear Regression
Regression analysis is a very widely used statistical tool for establishing a relationship model between two variables.
One of these variables is called the predictor variable, whose value is gathered through experiments.
The other variable is called the response variable, whose value is derived from the predictor variable.
Mathematically, a linear relationship represents a straight line when plotted as a graph.
The general mathematical equation for a linear regression is y = ax + b
Following is the description of the parameters used:
y is the response variable.
x is the predictor variable.
a and b are constants which are called the coefficients.
How much money should you allocate for gas?
You approach this problem with a science-oriented mindset, thinking that there must be a way to estimate the
amount of money needed, based on the distance you're travelling.

At this point these are just numbers. It's not very easy to get
any valuable information from this spreadsheet.

"If I drive for 1200 miles, how much will I pay for gas?"
Sl.No.  Total Miles (x)  Total Paid (y)  x*x           x*y
1       390              36.66           152100        14297.4
2       403              37.05           162409        14931.15
3       396.5            34.71           157212.25     13762.52
4       383.5            32.5            147072.25     12463.75
5       321.1            32.63           103105.21     10477.49
6       391.3            34.45           153115.69     13480.29
7       386.1            36.79           149073.21     14204.62
8       371.8            37.44           138235.24     13920.19
9       404.3            38.09           163458.49     15399.79
10      392.6            38.09           154134.76     14954.13
11      386.49           38.74           149374.5201   14972.62
12      395.2            39              156183.04     15412.8
13      385.5            40              148610.25     15420
14      372              36.21           138384        13470.12
15      397              34.05           157609        13517.85
16      407              41.79           165649        17008.53
17      372.33           30.25           138629.6289   11262.98
18      375.6            38.83           141075.36     14584.55
19      399              39.66           159201        15824.34
Total   7330.32          696.94          2834631.899   269365.1

$$a = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$$

$$b = \frac{1}{n}\left(\sum y_i - a \sum x_i\right)$$

$$y = ax + b$$
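The formulas for a and b can be checked directly in R. The sketch below re-enters the 19 (miles, paid) pairs from the table, applies the two formulas, and compares the result with lm(). The variable names are chosen for this example, and 1200 miles lies far outside the observed range, so the resulting estimate is an extrapolation.

```r
# Total miles (x) and total paid (y), copied from the table.
x <- c(390, 403, 396.5, 383.5, 321.1, 391.3, 386.1, 371.8, 404.3, 392.6,
       386.49, 395.2, 385.5, 372, 397, 407, 372.33, 375.6, 399)
y <- c(36.66, 37.05, 34.71, 32.5, 32.63, 34.45, 36.79, 37.44, 38.09, 38.09,
       38.74, 39, 40, 36.21, 34.05, 41.79, 30.25, 38.83, 39.66)
n <- length(x)

# Slope a and intercept b from the least-squares formulas.
a <- (n * sum(x * y) - sum(x) * sum(y)) / (n * sum(x^2) - sum(x)^2)
b <- (sum(y) - a * sum(x)) / n
print(c(slope = a, intercept = b))

# lm() fits the same line; coefficients are (intercept, slope).
fit <- lm(y ~ x)
print(coef(fit))

# Estimated cost of a 1200-mile trip using y = ax + b.
cost_1200 <- a * 1200 + b
print(cost_1200)
```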
Visualize the Regression Graphically
# Create the predictor (height in cm) and response (weight in kg) variables.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y~x)
# Give the chart file a name.
png(file = "linearregression.png")
# Plot the chart with weight on the horizontal axis and height on the vertical axis.
plot(y, x, col = "blue", main = "Height & Weight Regression", cex = 1.3, pch = 16, xlab = "Weight in Kg", ylab = "Height in cm")
# Add the fitted line for height as a function of weight, matching the axes above.
abline(lm(x~y))
# Save the file.
dev.off()
predict() Function
Predict the weight of new persons
# The predictor vector called Height
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
# The response vector called Weight
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
# Apply the lm() function
relation <- lm(y~x)
# Find weight of a person with height 170.
a <- data.frame(x = 170)
result <- predict(relation,a)
print(result)
When we execute the above code, it produces the following result:
       1
76.22869
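predict() simply evaluates the fitted line at the new value. The sketch below extracts the coefficients with coef() and computes the same prediction by hand; the names intercept, slope and manual are introduced only for this example.

```r
# Height (predictor) and weight (response), as in the slide.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# coef() returns the intercept first, then the slope for x.
intercept <- unname(coef(relation)[1])
slope <- unname(coef(relation)[2])

# Prediction by hand for a height of 170.
manual <- intercept + slope * 170
print(manual)

# The same prediction through predict().
predicted <- unname(predict(relation, data.frame(x = 170)))
print(predicted)
```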
Multiple Regression
Multiple regression extends linear regression to the relationship between more than two variables.
In simple linear regression we have one predictor and one response variable.
In multiple regression we have more than one predictor variable and one response variable.
The general mathematical equation for multiple regression is:
y = a + b1x1 + b2x2 + ... + bnxn
Following is the description of the parameters used:
y is the response variable.
a, b1, b2...bn are the coefficients.
x1, x2, ...xn are the predictor variables.
We create the regression model using the lm() function in R. The model determines the value of the coefficients using the input data.
Next we can predict the value of the response variable for a given set of predictor variables using these coefficients.
lm() Function
This function creates the relationship model between the predictors and the response variable.
Syntax
The basic syntax for the lm() function in multiple regression is:
lm(y ~ x1+x2+x3..., data)
Following is the description of the parameters used:
• formula is a symbolic description of the relation between the response variable and the predictor variables.
• data is the data frame whose columns are the variables used in the formula.
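A minimal sketch of the syntax above, using the built-in mtcars data set. The choice of mpg as the response and wt and hp as predictors is an assumption made for illustration, as are the values for the new car.

```r
# Fit mpg as a function of weight (wt) and horsepower (hp).
model <- lm(mpg ~ wt + hp, data = mtcars)

# One intercept plus one coefficient per predictor.
print(coef(model))

# Predict for a hypothetical car; column names must match the predictors.
new_car <- data.frame(wt = 3.0, hp = 120)
predicted_mpg <- predict(model, new_car)
print(predicted_mpg)
```

Passing a data frame through the data argument keeps the formula short and lets predict() match new values to predictors by column name.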
Unemployment Dataset
R Script Multiple Regression
# Capture the data in R format
Year <- c(2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016)
Month <- c(12,11,10,9,8,7,6,5,4,3,2,1,12,11,10,9,8,7,6,5,4,3,2,1)
Interest_Rate <- c(2.75,2.5,2.5,2.5,2.5,2.5,2.5,2.25,2.25,2.25,2,2,2,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75)
Unemployment_Rate <- c(5.3,5.3,5.3,5.3,5.4,5.6,5.5,5.5,5.5,5.6,5.7,5.9,6,5.9,5.8,6.1,6.2,6.1,6.1,6.1,5.9,6.2,6.2,6.1)
Stock_Index_Price <- c(1464,1394,1357,1293,1256,1254,1234,1195,1159,1167,1130,1075,1047,965,943,958,971,949,884,866,876,822,704,719)
# Check whether each predictor has a roughly linear relationship with the response
plot(x=Interest_Rate, y=Stock_Index_Price)
plot(x=Unemployment_Rate, y=Stock_Index_Price)
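The slides stop at the linearity plots for this data set. One plausible next step, sketched below as an assumption rather than as the author's script, is to fit Stock_Index_Price on both rates and predict for a new month. The new rate values are hypothetical.

```r
# Predictors and response, copied from the script above.
Interest_Rate <- c(2.75, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.25, 2.25, 2.25, 2, 2,
                   2, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75,
                   1.75, 1.75)
Unemployment_Rate <- c(5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7,
                       5.9, 6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1, 5.9, 6.2,
                       6.2, 6.1)
Stock_Index_Price <- c(1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159,
                       1167, 1130, 1075, 1047, 965, 943, 958, 971, 949, 884,
                       866, 876, 822, 704, 719)

# Collect the columns in a data frame so lm() can use the data argument.
stock <- data.frame(Interest_Rate, Unemployment_Rate, Stock_Index_Price)

# Fit the multiple regression model.
model <- lm(Stock_Index_Price ~ Interest_Rate + Unemployment_Rate, data = stock)
print(summary(model))

# Predict the index for hypothetical rates.
new_month <- data.frame(Interest_Rate = 2.0, Unemployment_Rate = 5.5)
predicted_price <- predict(model, new_month)
print(predicted_price)
```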
# Capture the data in R format
student <- c(1,2,3,4,5,6,7,8,9,10)
testscore <- c(100,95,92,90,85,80,78,75,72,65)
IQ <- c(125,104,110,105,100,100,95,95,85,90)
studyhrs <- c(30,40,25,20,20,20,15,10,0,5)
# Check whether the relationships between the variables look linear
plot(x=testscore, y=IQ)
plot(x=IQ, y=studyhrs)
#==================================================
# Predict Test Score using IQ and Study Hrs
relation <- lm(testscore ~ IQ + studyhrs)
a <- data.frame(IQ=120,studyhrs=40)
result <- predict(relation,a)
print(result)
#==================================================
# Predict IQ using Test Score and Study Hrs
relation <- lm(IQ ~ testscore + studyhrs)
a <- data.frame(testscore=50,studyhrs=25)
result <- predict(relation,a)
print(result)
#==================================================
# Predict Study Hrs using IQ and Test Score
relation <- lm(studyhrs ~ IQ + testscore )
a <- data.frame(IQ=140, testscore=90)
result <- predict(relation,a)
print(result)
# Create data for the graph.
x <- c(21, 62, 10, 53)
labels <- c("London", "New York", "Singapore", "Mumbai")
# Give the chart file a name.
png(file = "city.png")
# Plot the chart.
pie(x,labels)
# Save the file.
dev.off()

# Get the library.
library(plotrix)
# Create data for the graph.
x <- c(21, 62, 10,53)
lbl <- c("Nashik","Aurangabad","Navi Mumbai","Nagpur")
# Give the chart file a name.
png(file = "3d_pie_chart.png")
# Plot the chart.
pie3D(x, labels = lbl, explode = 0.1, main = "Pie Chart of Cities")
# Save the file.
dev.off()
Bar Chart with Color with attributes
# Create the data for the chart.
H <- c(7,12,28,3,41)
M <- c("Mar","Apr","May","Jun","Jul")
# Give the chart file a name.
png(file = "barchart_months_revenue.png")
# Plot the bar chart.
barplot(H, names.arg = M, xlab = "Month", ylab = "Revenue", col = "blue", main = "Revenue chart", border = "red")
# Save the file.
dev.off()
Bar chart – Stacked
colors <- c("green","orange","brown")
months <- c("Mar","Apr","May","Jun","Jul")
regions <- c("East","West","North")
Values <- matrix(c(2,9,3,11,9,4,8,7,3,12,5,2,8,10,11), nrow = 3, ncol = 5, byrow = TRUE)
png(file = "barchart_stacked.png")
barplot(Values, main = "total revenue", names.arg = months, xlab = "month", ylab = "revenue", col = colors)
legend("topleft", regions, cex = 1.3, fill = colors)
dev.off()
Box Plot Graphs
# Give the chart file a name.
png(file = "boxplot.png")
# Plot the chart.
boxplot(mpg ~ cyl, data = mtcars, xlab = "Number of Cylinders", ylab = "Miles Per Gallon", main = "Mileage Data")
# Save the file.
dev.off()
Histogram – Example
# Create data for the graph.
v <- c(9,13,21,8,36,22,12,41,31,33,19)
# Give the chart file a name.
png(file = "histogram.png")
# Create the histogram.
hist(v,xlab = "Weight",col = "yellow",border = "blue")
# Save the file.
dev.off()

barplot
X<-c(0,1,2,3)
Prob<-c(0.208,0.167,0.25,0.375)
N<-c('A','B','C','D')
# names.arg supplies the label shown under each bar.
barplot(Prob, names.arg=N, ylab="Probability", main="RNA Residue Analysis")