
One Day National Level Webinar on

Data Analytics using R Programming
Dr. P. Rajesh
Assistant Professor
PG Department of Computer Science
Government Arts College
C.Mutlur, Chidambaram.
Email: [email protected]

Arignar Anna Government Arts College, Villupuram. Date: 22.07.2020, Time: 10.00 am to 12.00 pm
Data Analytics
• Data Science and Data Analytics are two of the most widely used terms in the field today.
• Data is collected in raw form and processed according to the requirements of a company.
• The processed data is then used for decision making.
• This process helps businesses grow in the market.
• So what is this process called?
• The answer is Data Analytics, and Data Analysts and Data Scientists are the people who perform it.
What is Data Analytics?
• Data or information starts out in raw format.
• As data has grown in size, so has the need for inspection, data cleaning and transformation.
• Data modeling is then used to gain insights from the data and to derive conclusions that support better decision making.
• This process is known as data analysis.
• Analysis is an interactive process in which a person tackles a problem, finds the data required to get an answer, analyzes that data, and interprets the results.
https://www.kdnuggets.com/2017/07/4-types-data-analytics.html
1. Descriptive: What is happening?
• With the help of descriptive analysis, we analyze and describe the features of the data.
• It deals with the summarization of information.
• In descriptive analysis, we work with past data to draw conclusions and present them as actionable results.
2. Diagnostic: Why is it happening?
• Diagnostic analytics is a form of advanced analytics that examines data or content to answer the question "Why is it happening?"
• It is characterized by techniques such as data discovery, data mining and correlations.
3. Predictive: What is likely to happen?
• With the help of predictive analysis, we estimate future outcomes.
• Based on the analysis of historical data, we are able to forecast the future.
• Data analytics, technological advancements and machine learning together make it possible to obtain predictive insights about the future effectively.
• Predictive analytics is a complex field that requires a large amount of data, skilled implementation of predictive models, and careful tuning of those models to obtain accurate predictions.
4. Prescriptive: What do I need to do?
• Prescriptive analysis builds on an understanding of:
  - what has happened,
  - why it has happened, and
  - a variety of "what-might-happen" analyses.
• It helps the user determine the best course of action to take.
• Prescriptive analysis typically involves not just one individual action but a host of other actions.
• Example: finding the best route home by considering the distance of each route and the speed.
What are the types of Data in Data Analytics?
History
• R is a programming language and software environment for statistical analysis, graphics representation and reporting.
• R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand.
• R is freely available under the GNU General Public License.
• R is available for various operating systems such as Linux, Windows and Mac.
• The language was named R after the first letter of the first names of its two authors (Robert Gentleman and Ross Ihaka).
• The name is also partly a play on the name of the Bell Labs language S.
• R was initially written at the Department of Statistics of the University of Auckland in Auckland, New Zealand.
• R made its first appearance in 1993.
• Since mid-1997 there has been a core group (the "R Core Team") who can modify the R source code archive.
Why Learn R Programming Language
• With R, you can perform statistical analysis, data analysis and machine learning.
• We can create objects, functions and packages in it.
• R is platform-independent and can be used across multiple operating systems.
• R is free owing to its open-source GNU licensing and can be installed by anyone.
• R has a robust collection of graphical libraries such as ggplot2, plotly and many more.
• R is widely used in industries such as health, finance, banking, manufacturing and many more.
• There are about 2 million job openings for R programmers worldwide.
• Companies hire R programmers for many roles like data analysts, business …
Features of R
• As stated earlier, R is a programming language and software environment for statistical analysis, graphics representation and reporting. The following are the important features of R:
• R is a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions, and input and output facilities.
• R has an effective data handling and storage facility.
• R provides a suite of operators for calculations on arrays, lists, vectors and matrices.
• R provides a large, coherent and integrated collection of tools for data analysis.
How R is better than Other Technologies
Certain aspects of R programming make it compare well with other technologies:
• Graphical Libraries – Libraries like ggplot2 and plotly make it easy to produce appealing, well-defined plots.
• Availability / Cost – R is completely free.
• Advancement in Tool – R supports various advanced tools and features that allow you to build robust statistical models.
• Job Scenario – With the immense growth of Data Science and the rise in demand, R has become one of the most in-demand programming languages today.
• Customer Service Support and Community – With R, you can enjoy strong community support.
• Portability – R is highly portable. Many different programming languages and software frameworks can easily be combined with the R environment for the best results.
Sourcing of R Script
RStudio
• RStudio is an Integrated Development Environment (IDE) for R.
• It provides extensive code editing and development features.
Features of RStudio
• RStudio provides various tools and features that help boost your coding productivity.
• It can also be accessed over the web and is cross-platform.
• It checks for updates automatically.
• It supports recovery in case of file loss.
• With RStudio, you can manage data more efficiently.
Components of RStudio
• Source – The text editor in the top left corner of the screen, where you work with source scripts. You can enter multiple lines here.
• Console – Located in the bottom left corner of the main RStudio window. It supports interactive scripting in R.
• Workspace and History – In the top right corner are the R workspace and the history window. They list all the variables in the workspace and the past commands that were executed by R.
Applications of R Programming
• R is used in the finance and banking sectors for detecting fraud, reducing customer churn rate and making future decisions.
• It is used in bioinformatics to analyze strands of genetic sequences, to perform drug discovery, and in computational neuroscience.
• It is used in social media analysis to discover potential customers for online advertising.
• Companies also use social media information to analyze customer sentiment and improve their products.
• E-commerce companies use R to analyze customers' purchases as well as their feedback.
• Manufacturing companies use R to analyze customer feedback.
• They also use it to predict future demand, so they can adjust their manufacturing speeds and maximize profits.
Companies Using R
Some of the companies that are using R programming are as follows:
• Facebook
• Google
• LinkedIn
• IBM
• Twitter
• Uber
• Airbnb
• Ford Motor Company
• Microsoft
R Installation
https://cran.r-project.org/bin/windows/base/
R Console Window

R Command Prompt
Once you have the R environment set up, you can start the R command prompt by typing the following command at your shell prompt.
$ R
This will launch the R interpreter and you will get a prompt > where you can start typing your program as follows.
> myString <- "Hello, World!"
> print(myString)
[1] "Hello, World!"
RStudio (IDE)
https://rstudio.com/products/rstudio/download/#download
R - Data Types
• In contrast to programming languages like C and Java, variables in R are not declared with a data type.
• Variables are assigned R objects, and the data type of the R object becomes the data type of the variable.
• There are many types of R objects. The frequently used ones are:
  - Vectors
  - Lists
  - Matrices
  - Arrays
  - Factors
  - Data Frames
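As a minimal sketch of the point above, the following creates one object of each frequently used kind and asks R what it is. The variable names and values are illustrative assumptions, not taken from the slides.

```r
# One example of each frequently used R object (values are illustrative).
v <- c(1.5, 2.5, 3.5)                           # vector (atomic, double)
l <- list("a", 2L, TRUE)                        # list with mixed element types
m <- matrix(1:6, nrow = 2)                      # matrix (2 rows, 3 columns)
a <- array(1:24, dim = c(2, 3, 4))              # three-dimensional array
f <- factor(c("low", "high", "low"))            # factor with two levels
d <- data.frame(id = 1:2, name = c("x", "y"))   # data frame

# class() reports the type that the object gives to the variable.
print(class(v))
print(class(l))
print(class(m))
print(class(a))
print(class(f))
print(class(d))

# The same variable can later hold an object of a different type.
x <- 42L
print(class(x))
x <- "forty-two"
print(class(x))
```

No declaration is needed at any point: the type follows whatever object was assigned last.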
R - Functions
• A function is a set of statements organized to perform a specific task.
• R has a large number of built-in functions.
• Users can also create their own functions.
Built-in Function
• Simple examples of built-in functions are seq(), mean(), max(), sum() and paste().
• They are called directly by user-written programs.
# Create a sequence of numbers from 32 to 44.
print(seq(32,44))
# Find mean of numbers from 25 to 82.
print(mean(25:82))
# Find sum of numbers from 41 to 68.
print(sum(41:68))
User-defined Function
• User-defined functions are specific to what a user wants; once created, they can be used like the built-in functions.
# Create a function to print squares of numbers in sequence.
new.function <- function(a) {
  for (i in 1:a) {
    b <- i^2
    print(b)
  }
}
Calling a Function
# Call the function new.function supplying 6 as an argument.
new.function(6)
Produces the following result −
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
[1] 36
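A variation that may be useful in practice is to return the squares instead of printing them, so the caller can reuse the result. The sketch below assumes a hypothetical helper named squares(), which is not part of the slides, and relies on R's vectorized arithmetic rather than a loop.

```r
# Hypothetical helper (not from the slides): return the squares of 1..n.
squares <- function(n) {
  stopifnot(n >= 1)   # guard: 1:n would count downwards for n < 1
  (1:n)^2             # arithmetic is applied to the whole vector at once
}

result <- squares(6)
print(result)
# [1]  1  4  9 16 25 36
```

Because the function returns a vector, the result can be passed on to other functions such as sum() or mean().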
R String Manipulation Functions
1. grep()
It is used for pattern matching. It returns the matching elements (value=TRUE) or their positions (value=FALSE).
> grep("b+", c("abc", "bda", "ccaa", "abd"), perl=TRUE, value=TRUE)
[1] "abc" "bda" "abd"
> grep("b+", c("abc", "bda", "ccaa", "abd"), perl=TRUE, value=FALSE)
[1] 1 2 4
> grep("chid+", c("chidambaram", "Villupuram", "Srimushnam", "chidambaram"), perl=TRUE, value=FALSE)
[1] 1 4
> grep("அ+", c("அப்பா", "தாத்தா", "அம்மா"), perl=TRUE, value=FALSE)
[1] 1 3
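For the replacement side of pattern matching, base R provides sub() and gsub(), and grepl() gives a TRUE/FALSE answer per element. The sketch below reuses the place names from the grep() example; the patterns themselves are illustrative assumptions.

```r
cities <- c("chidambaram", "Villupuram", "Srimushnam")

# sub() replaces only the first match in each string.
first_only <- sub("a", "A", cities)
print(first_only)
# [1] "chidAmbaram" "VillupurAm"  "SrimushnAm"

# gsub() replaces every match in each string.
every_one <- gsub("a", "A", cities)
print(every_one)
# [1] "chidAmbArAm" "VillupurAm"  "SrimushnAm"

# grepl() returns a logical vector: TRUE where the pattern matches.
ends_in_puram <- grepl("puram$", cities)
print(ends_in_puram)
# [1] FALSE  TRUE FALSE
```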
2. nchar()
With the help of this function, we can count the characters in a string.
> str <- "Big Data at DataFlair"
> nchar(str)
[1] 21
3. paste()
Concatenate any number of strings using the paste() function. By default the strings are joined with a single space.
> #Author DataFlair
> paste("Hadoop", "Spark", "and", "Flink")
[1] "Hadoop Spark and Flink"
4. sprintf()
This function makes use of formatting commands styled after C.
> sprintf("%s scored %.2f percent", "Matthew", 72.3)
[1] "Matthew scored 72.30 percent"
5. strsplit()
> #Author DataFlair
> str = "Splitting sentence into words"
> strsplit(str, " ")
[[1]]
[1] "Splitting" "sentence" "into" "words"
> strsplit(str, "")
[[1]]
[1] "S" "p" "l" "i" "t" "t" "i" "n" "g" " " "s" "e" "n" "t" "e" "n" "c" "e" " "
"i" "n" "t" "o" " " "w" "o" "r" "d" "s"
Vector
Vectors are the most basic R data objects and there are six types of atomic
vectors. They are logical, integer, double, complex, character and raw.

Multiple Elements Vector - Using colon operator with numeric data
# Creating a sequence from 5 to 13.
v <- 5:13
print(v)
[1] 5 6 7 8 9 10 11 12 13
# Creating a sequence from 6.6 to 12.6.
v <- 6.6:12.6
print(v)
[1] 6.6 7.6 8.6 9.6 10.6 11.6 12.6
# If the final element specified does not belong to the sequence then it is discarded.
v <- 3.8:11.4
print(v)
[1] 3.8 4.8 5.8 6.8 7.8 8.8 9.8 10.8
Using the seq() function
# Create vector with elements from 5 to 9 incrementing by 0.4.
print(seq(5, 9, by = 0.4))
[1] 5.0 5.4 5.8 6.2 6.6 7.0 7.4 7.8 8.2 8.6 9.0
Using the c() function
The non-character values are coerced to character type if one of the elements is a character.
# The logical and numeric values are converted to characters.
s <- c('apple','red',5,TRUE)
print(s)
[1] "apple" "red" "5" "TRUE"
Accessing Vector Elements
• The [ ] brackets are used for indexing. Indexing starts at position 1.
• A negative value in the index drops that element from the result.
• Logical values (TRUE, FALSE) can also be used for indexing. In a numeric index, 0 selects nothing and 1 selects the first element.
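The indexing rules above can be sketched side by side as follows. The vector name days and the chosen positions are illustrative assumptions.

```r
days <- c("Sun", "Mon", "Tue", "Wed", "Thurs", "Fri", "Sat")

# Positive positions select elements; counting starts at 1.
first_day <- days[1]

# Negative positions drop elements.
without_mon <- days[-2]

# A logical index keeps the elements where the condition is TRUE.
weekend <- days[days %in% c("Sat", "Sun")]

# In a numeric index, zeros select nothing, so only position 1 is returned.
zero_one <- days[c(0, 0, 1)]

print(first_day)
print(without_mon)
print(weekend)
print(zero_one)
```

Note that a numeric 0/1 index is not the same as a logical index: c(0, 0, 1) returns the first element, not the third.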

# Accessing vector elements using position.
t <- c("Sun","Mon","Tue","Wed","Thurs","Fri","Sat")
u <- t[c(2,3,6)]
print(u)
[1] "Mon" "Tue" "Fri"
# Accessing vector elements using logical indexing.
v <- t[c(TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,FALSE)]
print(v)
[1] "Sun" "Fri"
# Accessing vector elements using negative indexing.
x <- t[c(-2,-5)]
print(x)
[1] "Sun" "Tue" "Wed" "Fri" "Sat"
R - Lists
• Lists are R objects which can contain elements of different types.
• A list can hold numbers, strings, vectors and even another list.
• A list is created using the list() function.
Creating a List
# Create a list containing strings, numbers, a vector and a logical value.
list_data <- list("Red", "Green", c(21,32,11), TRUE, 51.23, 119.1)
print(list_data)
[[1]]
[1] "Red"
[[2]]
[1] "Green"
[[3]]
[1] 21 32 11
[[4]]
[1] TRUE
[[5]]
[1] 51.23
[[6]]
[1] 119.1
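Once a list exists, its elements are usually reached by name or by position. The following is a small sketch under the assumption that the elements are given names (colour, sizes, flag), which the slide example does not do.

```r
# A named list with elements of different types (names are illustrative).
list_data <- list(colour = "Red", sizes = c(21, 32, 11), flag = TRUE)

# $ and [[ ]] return the element itself.
colour <- list_data$colour
second_size <- list_data[["sizes"]][2]

# Single brackets return a smaller list, not the element.
flag_as_list <- list_data["flag"]

# Assigning to a new name appends an element.
list_data$price <- 51.23

print(colour)
print(second_size)
print(class(flag_as_list))
print(names(list_data))
```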
R - Matrices
• Matrices are R objects in which the elements are arranged in a two-dimensional rectangular layout.
Syntax
matrix(data, nrow, ncol, byrow, dimnames)
• data is the input vector which becomes the data elements of the matrix.
• nrow is the number of rows to be created.
• ncol is the number of columns to be created.
• byrow is a logical value. If TRUE, the input vector elements are arranged by row.
• dimnames is the names assigned to the rows and columns.
Matrix Example
# Elements are arranged sequentially by row.
M <- matrix(c(3:14), nrow = 4, byrow = TRUE)
print(M)
# Elements are arranged sequentially by column.
N <- matrix(c(3:14), nrow = 4, byrow = FALSE)
print(N)
# Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(rownames, colnames))
print(P)
Accessing Elements of a Matrix
Elements of a matrix can be accessed by using the row and column index.
# Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
# Create the matrix.
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(rownames, colnames))
# Access the element at 3rd column and 1st row.
print(P[1,3])
[1] 5
# Access the element at 2nd column and 4th row.
print(P[4,2])
[1] 13
# Access only the 2nd row.
print(P[2,])
col1 col2 col3
   6    7    8
# Access only the 3rd column.
print(P[,3])
row1 row2 row3 row4
   5    8   11   14
# Example for Matrix
A <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9),3,3)
B <- matrix(c(11, 12, 13, 14, 15, 16, 17, 18, 19),3,3)
C <- matrix(c(21, 22, 23, 24, 25, 26, 27, 28, 29),3,3)
A
# Check whether the variable A is a matrix or not
is.matrix(A)
# Multiplication by a Scalar
s<-3
s1<-A*s
s1
# Matrix Addition
Matadd<-A+B
Matadd
# Matrix Subtraction
Matsub<-A-B
# Matrix Multiplication (%*% is the matrix product; A*B multiplies element by element)
Matmul<-A%*%B
Matmul
# Matrix Transpose
tm<-t(A)
tm
# Computing Column & Row Sums
sum(A)
colSums(A)
rowSums(A)
# Computing Column & Row Means
mean(A)
colMeans(A)
rowMeans(A)
# Accessing the matrix element
A
A[2,2]
A[2,2]<-33
A[3,2]+B[2,2]
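One detail worth checking by hand is the difference between the * operator and the %*% operator on matrices. The sketch below uses two small 2 x 2 matrices, chosen here only so the arithmetic is easy to verify.

```r
# matrix() fills column by column, so A is [1 3; 2 4] and B is [5 7; 6 8].
A <- matrix(1:4, nrow = 2)
B <- matrix(5:8, nrow = 2)

# "*" multiplies matching elements: 1*5, 2*6, 3*7, 4*8.
elementwise <- A * B
print(elementwise)
#      [,1] [,2]
# [1,]    5   21
# [2,]   12   32

# "%*%" is the matrix product (rows of A times columns of B).
product <- A %*% B
print(product)
#      [,1] [,2]
# [1,]   23   31
# [2,]   34   46
```

For example, the top-left entry of the product is 1*5 + 3*6 = 23, whereas the element-wise result there is simply 1*5 = 5.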
R - Data Frames
• A data frame is a table-like structure.
• Each column contains values of one variable.
Following are the characteristics of a data frame:
• The column names should be non-empty.
• The row names should be unique.
• The data stored in a data frame can be of numeric, factor or character type.
• Each column should contain the same number of data items.

# Create the data frame.
emp.data <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
  salary = c(623.3,515.2,611.0,729.0,843.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Print the data frame.
print(emp.data)
Summary of Data in Data Frame
The statistical summary and nature of the data can be obtained by applying the summary() function.
# Create the data frame.
emp.data <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
  salary = c(623.3,515.2,611.0,729.0,843.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Print the data frame.
print(emp.data)
# Print the summary.
print(summary(emp.data))
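Beyond summary(), a few everyday operations on a data frame are selecting a column, filtering rows and adding a derived column. The sketch below reuses the employee values from the slides; the 700 salary threshold and the 10% bonus column are hypothetical choices made only for illustration.

```r
emp.data <- data.frame(
  emp_id = 1:5,
  emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
  stringsAsFactors = FALSE
)

# str() shows the type of each column.
str(emp.data)

# Select one column as a vector.
names_only <- emp.data$emp_name

# Keep the rows with salary above 700 (hypothetical threshold) and two columns.
high_paid <- emp.data[emp.data$salary > 700, c("emp_name", "salary")]
print(high_paid)

# Add a derived column (hypothetical 10% bonus).
emp.data$bonus <- emp.data$salary * 0.10
print(ncol(emp.data))
```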
Working with CSV Files
• R can read data from files stored outside the R environment.
• It can also write data into files which are stored and accessed by the operating system.
• R can read and write various file formats such as CSV, Excel and XML.
Getting and Setting the Working Directory
getwd() - Find the working directory
setwd() - Set the working directory
# Get and print current working directory.
print(getwd())
# Set current working directory.
setwd("/web/com")
# Get and print current working directory.
print(getwd())
Input as CSV File
• A CSV file is a text file.
• The values in the columns are separated by commas.
• The following data is present in the file named input.csv.
id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
Reading a CSV File
The read.csv() function reads a CSV file available in your current working directory.
data <- read.csv("input.csv")
print(data)
Analysing the CSV File
The read.csv() function returns its output as a data frame.
data <- read.csv("input.csv")
print(is.data.frame(data))
[1] TRUE
print(ncol(data))
[1] 5
print(nrow(data))
[1] 8
Get the maximum salary
# Create a data frame.
data <- read.csv("input.csv")
# Get the max salary from data frame.
sal <- max(data$salary)
print(sal)
# Get the details of the person having the max salary.
retval <- subset(data, salary == max(salary))
print(retval)
Writing into a CSV File
• R can create a CSV file from an existing data frame.
• The write.csv() function is used to create the CSV file.
• The file is created in the working directory.
# Create a data frame.
data <- read.csv("input.csv")
retval <- subset(data, as.Date(start_date) > as.Date("2014-01-01"))
# Write filtered data into a new file.
write.csv(retval,"output.csv")
newdata <- read.csv("output.csv")
print(newdata)
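The read, filter and write steps can also be tried without an input.csv on disk. The sketch below is a self-contained round trip that writes to a temporary file; the three rows are copied from the sample data, and the use of tempfile() and row.names = FALSE is an assumption of this sketch rather than something the slides do.

```r
# Build a small data frame in memory (rows taken from the sample data).
staff <- data.frame(
  id = 1:3,
  name = c("Rick", "Dan", "Michelle"),
  start_date = c("2012-01-01", "2013-09-23", "2014-11-15"),
  stringsAsFactors = FALSE
)

# Write it to a temporary CSV file. row.names = FALSE leaves out the extra
# column of row numbers that write.csv() adds by default.
path <- tempfile(fileext = ".csv")
write.csv(staff, path, row.names = FALSE)

# Read it back and keep only people who started after 1 January 2013.
reloaded <- read.csv(path, stringsAsFactors = FALSE)
recent <- subset(reloaded, as.Date(start_date) > as.Date("2013-01-01"))
print(recent)

# Remove the temporary file.
unlink(path)
```

Without row.names = FALSE, the file read back would contain an additional first column (named X by read.csv()) holding the original row numbers.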
Linear Regression
Regression analysis is a very widely used statistical tool for establishing a relationship model between two variables.
One of these variables is called the predictor variable, whose value is gathered through experiments.
The other variable is called the response variable, whose value is derived from the predictor variable.
Mathematically, a linear relationship represents a straight line when plotted as a graph.
The general mathematical equation for a linear regression is y = ax + b
Following is the description of the parameters used −
y is the response variable.
x is the predictor variable.
a and b are constants which are called the coefficients.
How much money should you allocate for gas?
You approach this problem with a science-oriented mindset, thinking that there must be a way to estimate the
amount of money needed, based on the distance you're travelling.

At this point these are just numbers. It's not very easy to get
any valuable information from this spreadsheet.

"If I drive for 1200 miles, how much will I pay for gas?"
Sl.No.  Total Miles (x)  Total Paid (y)  x*x           x*y
1       390              36.66           152100        14297.4
2       403              37.05           162409        14931.15
3       396.5            34.71           157212.25     13762.52
4       383.5            32.5            147072.25     12463.75
5       321.1            32.63           103105.21     10477.49
6       391.3            34.45           153115.69     13480.29
7       386.1            36.79           149073.21     14204.62
8       371.8            37.44           138235.24     13920.19
9       404.3            38.09           163458.49     15399.79
10      392.6            38.09           154134.76     14954.13
11      386.49           38.74           149374.5201   14972.62
12      395.2            39              156183.04     15412.8
13      385.5            40              148610.25     15420
14      372              36.21           138384        13470.12
15      397              34.05           157609        13517.85
16      407              41.79           165649        17008.53
17      372.33           30.25           138629.6289   11262.98
18      375.6            38.83           141075.36     14584.55
19      399              39.66           159201        15824.34
Total   7330.32          696.94          2834631.899   269365.1

$$a = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$$
$$b = \frac{1}{n}\left(\sum y_i - a \sum x_i\right)$$
$$y = ax + b$$
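The slope and intercept formulas can be evaluated directly in R and compared with what lm() returns. The sketch below types in the 19 rows of the table; the variable names (miles, paid, cost_1200) are assumptions of this sketch.

```r
# Total miles (x) and total paid (y) from the 19 rows of the table.
miles <- c(390, 403, 396.5, 383.5, 321.1, 391.3, 386.1, 371.8, 404.3, 392.6,
           386.49, 395.2, 385.5, 372, 397, 407, 372.33, 375.6, 399)
paid <- c(36.66, 37.05, 34.71, 32.5, 32.63, 34.45, 36.79, 37.44, 38.09, 38.09,
          38.74, 39, 40, 36.21, 34.05, 41.79, 30.25, 38.83, 39.66)
n <- length(miles)

# Slope a and intercept b from the closed-form least-squares formulas.
a <- (n * sum(miles * paid) - sum(miles) * sum(paid)) /
     (n * sum(miles^2) - sum(miles)^2)
b <- (sum(paid) - a * sum(miles)) / n

# lm() fits the same line; coef() returns the intercept first, then the slope.
fit <- lm(paid ~ miles)
print(c(intercept = b, slope = a))
print(coef(fit))

# Estimated cost for 1200 miles using y = ax + b.
cost_1200 <- a * 1200 + b
print(cost_1200)
```

The two sets of coefficients should agree up to floating-point error. Since 1200 miles lies far outside the observed range of roughly 320 to 410 miles, the estimate is an extrapolation and should be read with caution.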
Visualize the Regression Graphically
# Create the predictor and response variable.
# x holds heights in cm and y holds weights in kg.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y~x)
# Give the chart file a name.
png(file = "linearregression.png")
# Plot the chart with weight on the horizontal axis and height on the vertical axis.
plot(y, x, col = "blue", main = "Height & Weight Regression", cex = 1.3, pch = 16, xlab = "Weight in Kg", ylab = "Height in cm")
# Add the fitted regression line for these axes.
abline(lm(x~y))
# Save the file.
dev.off()
predict() Function
Predict the weight of new persons
# The predictor vector called Height
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
# The response vector called Weight
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
# Apply the lm() function
relation <- lm(y~x)
# Find weight of a person with height 170.
a <- data.frame(x = 170)
result <- predict(relation,a)
print(result)
When we execute the above code, it produces the following result −
       1
76.22869
Multiple Regression
Multiple regression is an extension of linear regression to relationships between more than two variables.
In simple linear relation we have one predictor and one response
variable.
But in multiple regression we have more than one predictor variable and
one response variable.
The general mathematical equation for multiple regression is −
y = a + b1x1 + b2x2 +...bnxn
Following is the description of the parameters used −
y is the response variable.
a, b1, b2...bn are the coefficients.
x1, x2, ...xn are the predictor variables.
We create the regression model using the lm() function in R. The model
determines the value of the coefficients using the input data.
Next we can predict the value of the response variable for a given set of
predictor variables using these coefficients.
lm() Function
This function creates the relationship model between the predictors and the response variable.
Syntax
The basic syntax for lm() function in multiple regression is −
lm(y ~ x1+x2+x3..., data)
Following is the description of the parameters used −
• formula is a symbolic description of the relation between the response variable and the predictor variables.
• data is the data frame containing the variables to which the formula will be applied.
Unemployment Dataset
R Script Multiple Regression
# Capture the data in R format
Year <- c(2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016)
Month <- c(12,11,10,9,8,7,6,5,4,3,2,1,12,11,10,9,8,7,6,5,4,3,2,1)
Interest_Rate <- c(2.75,2.5,2.5,2.5,2.5,2.5,2.5,2.25,2.25,2.25,2,2,2,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75)
Unemployment_Rate <- c(5.3,5.3,5.3,5.3,5.4,5.6,5.5,5.5,5.5,5.6,5.7,5.9,6,5.9,5.8,6.1,6.2,6.1,6.1,6.1,5.9,6.2,6.2,6.1)
Stock_Index_Price <- c(1464,1394,1357,1293,1256,1254,1234,1195,1159,1167,1130,1075,1047,965,943,958,971,949,884,866,876,822,704,719)
# Check whether each predictor has a roughly linear relationship with the response
plot(x=Interest_Rate, y=Stock_Index_Price)
plot(x=Unemployment_Rate, y=Stock_Index_Price)
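After the linearity check, a natural next step is to fit the model with lm() and use predict() on new values. The sketch below is one way to do that with the same data; the data frame name stock and the new values (interest rate 2.75, unemployment rate 5.3) are illustrative assumptions.

```r
# Same values as the script above, collected into one data frame.
stock <- data.frame(
  Interest_Rate = c(2.75, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.25, 2.25, 2.25, 2, 2,
                    2, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75,
                    1.75, 1.75),
  Unemployment_Rate = c(5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7,
                        5.9, 6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1, 5.9, 6.2,
                        6.2, 6.1),
  Stock_Index_Price = c(1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159,
                        1167, 1130, 1075, 1047, 965, 943, 958, 971, 949, 884,
                        866, 876, 822, 704, 719)
)

# Fit y = a + b1*x1 + b2*x2 with two predictors.
model <- lm(Stock_Index_Price ~ Interest_Rate + Unemployment_Rate, data = stock)

# Intercept a and coefficients b1, b2, plus the usual fit statistics.
print(coef(model))
print(summary(model))

# Predict the index price for a hypothetical new observation.
new_obs <- data.frame(Interest_Rate = 2.75, Unemployment_Rate = 5.3)
prediction <- predict(model, new_obs)
print(prediction)
```

Passing the data through the data argument keeps the formula short and lets predict() match the new values to the predictors by column name.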
# Capture the data in R format
student <- c(1,2,3,4,5,6,7,8,9,10)
testscore <- c(100,95,92,90,85,80,78,75,72,65)
IQ <- c(125,104,110,105,100,100,95,95,85,90)
studyhrs <- c(30,40,25,20,20,20,15,10,0,5)
# Check whether the variables have a roughly linear relationship
plot(x=testscore, y=IQ)
plot(x=IQ, y=studyhrs)
#==================================================
# Predict Test Score using IQ and Study Hrs
relation <- lm(testscore ~ IQ + studyhrs)
a <- data.frame(IQ=120,studyhrs=40)
result <- predict(relation,a)
print(result)
#==================================================
# Predict IQ using Test Score and Study Hrs
relation <- lm(IQ ~ testscore + studyhrs)
a <- data.frame(testscore=50,studyhrs=25)
result <- predict(relation,a)
print(result)
#==================================================
# Predict Study Hrs using IQ and Test Score
relation <- lm(studyhrs ~ IQ + testscore)
a <- data.frame(IQ=140, testscore=90)
result <- predict(relation,a)
print(result)
# Create data for the graph.
x <- c(21, 62, 10, 53)
labels <- c("London", "New York", "Singapore", "Mumbai")
# Give the chart file a name.
png(file = "city.png")
# Plot the chart.
pie(x,labels)
# Save the file.
dev.off()

# Get the library.
library(plotrix)
# Create data for the graph.
x <- c(21, 62, 10, 53)
lbl <- c("Nashik","Aurangabad","Navi Mumbai","Nagpur")
# Give the chart file a name.
png(file = "3d_pie_chart.png")
# Plot the chart.
pie3D(x, labels = lbl, explode = 0.1, main = "Pie Chart of Cities")
# Save the file.
dev.off()
Bar Chart with Color and Attributes
# Create the data for the chart.
H <- c(7,12,28,3,41)
M <- c("Mar","Apr","May","Jun","Jul")
# Give the chart file a name.
png(file = "barchart_months_revenue.png")
# Plot the bar chart.
barplot(H, names.arg = M, xlab = "Month", ylab = "Revenue", col = "blue", main = "Revenue chart", border = "red")
# Save the file.
dev.off()
Bar chart – Stacked
# Create the data for the chart.
colors <- c("green","orange","brown")
months <- c("Mar","Apr","May","Jun","Jul")
regions <- c("East","West","North")
Values <- matrix(c(2,9,3,11,9,4,8,7,3,12,5,2,8,10,11), nrow = 3, ncol = 5, byrow = TRUE)
# Give the chart file a name.
png(file = "barchart_stacked.png")
# Plot the stacked bar chart and add a legend.
barplot(Values, main = "total revenue", names.arg = months, xlab = "month", ylab = "revenue", col = colors)
legend("topleft", regions, cex = 1.3, fill = colors)
# Save the file.
dev.off()
Box Plot Graphs
# Give the chart file a name.
png(file = "boxplot.png")
# Plot the chart.
boxplot(mpg ~ cyl, data = mtcars, xlab = "Number of Cylinders", ylab = "Miles Per Gallon", main = "Mileage Data")
# Save the file.
dev.off()
Histogram – Example
# Create data for the graph.
v <- c(9,13,21,8,36,22,12,41,31,33,19)
# Give the chart file a name.
png(file = "histogram.png")
# Create the histogram.
hist(v,xlab = "Weight",col = "yellow",border = "blue")
# Save the file.
dev.off()

barplot
X<-c(0,1,2,3)
Prob<-c(0.208,0.167,0.25,0.375)
N<-c('A','B','C','D')
barplot(Prob, names.arg=N, ylab="Probability", main="RNA Residue Analysis")


One Day National Level Webinar on

 The help of descriptive analysis, we

 R was initially written at the Department of Statistics of the

University of Auckland in Auckland, New Zealand.

 R made its first appearance in 1993.

who can modify the R source code archive.

of R Studio. It facilitates interactive scripting in R.

the list of past commands that were executed by R.

command prompt by just typing the following command.

can start typing your program as follows.

> myString <- "Hello, World!"

> print(myString)

[1] "Hello, World!"

Multiple Elements Vector - Using colon operator with numeric data

# Creating a sequence from 5 to 13.

# Create vector with elements from 5 to 9 incrementing by 0.4.
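The two comments above describe numeric sequences. A minimal sketch of both, assuming the fractional step is produced with `seq()`; the variable names `v` and `w` are chosen here only for illustration:

```r
# Colon operator: integer sequence from 5 to 13 (inclusive).
v <- 5:13
print(v)

# seq() with an explicit step: starts at 5 and increases by 0.4 up to 9.
w <- seq(5, 9, by = 0.4)
print(w)
```

The colon operator always steps by 1, so `seq()` is the usual choice when another increment is needed.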

Using the c() function

# The logical and numeric values are converted to characters.

# Accessing vector elements using position.
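The points above on `c()`, type conversion, and positional access can be sketched as follows. The values and the names `s`, `days`, and `picked` are illustrative assumptions, not taken from the slides:

```r
# Mixing types in c() coerces every element to the most general type,
# which is character in this case.
s <- c("apple", "red", 5, TRUE)
print(s)
print(class(s))

# Positional indexing is 1-based; a vector of positions selects several elements.
days <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
picked <- days[c(2, 3, 6)]
print(picked)
```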

# Create a list containing strings, numbers, vectors and logical values.
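A short sketch of such a list, with made-up contents and a hypothetical name `items`:

```r
# A list can hold elements of different types and lengths.
items <- list("Red", "Green", c(21, 32, 11), TRUE, 51.23, 119.1)
print(length(items))

# Double brackets extract a single element; here, the numeric vector.
third <- items[[3]]
print(third)
```

Single brackets (`items[3]`) would instead return a list of length one that contains the vector.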

 Matrices are the R objects in which the elements are arranged in a

matrix(data, nrow, ncol, byrow, dimnames)

# Elements are arranged sequentially by row.

# Elements are arranged sequentially by column.

# Define the column and row names.

P <- matrix(c(3:14), nrow = 4, byrow = TRUE,

dimnames = list(rownames, colnames))

# Define the column and row names.

# Create the matrix.

# Access the element at 3rd column and 1st row.

# Access the element at 2nd column and 4th row.

# Access only the 2nd row.
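The matrix steps above can be put together in one runnable sketch. The label vectors `row_labels` and `col_labels` are names assumed here for illustration; they are chosen so as not to mask R's built-in `rownames()` and `colnames()` functions:

```r
# Row and column labels (illustrative names).
row_labels <- c("row1", "row2", "row3", "row4")
col_labels <- c("col1", "col2", "col3")

# Fill the values 3..14 row by row into a 4 x 3 matrix.
P <- matrix(3:14, nrow = 4, byrow = TRUE,
            dimnames = list(row_labels, col_labels))
print(P)

# Indexing is [row, column].
print(P[1, 3])  # 1st row, 3rd column
print(P[4, 2])  # 4th row, 2nd column
print(P[2, ])   # entire 2nd row
```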

 A data frame is a table or structure.

 Each column contains values of one variable.

Following are the characteristics of a data frame

 The column names should be non-empty.

 The row names should be unique.

 The data stored in a data frame can be of numeric, factor or

 Each column should contain the same number of data items.
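A minimal sketch of a data frame that follows these characteristics. The table `emp` and all of its values are hypothetical:

```r
# Hypothetical employee data; all values are made up for illustration.
emp <- data.frame(
  id         = 1:3,
  name       = c("Asha", "Bala", "Chitra"),
  salary     = c(623.30, 515.20, 611.00),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15")),
  stringsAsFactors = FALSE
)

# str() prints one line per column together with its type.
str(emp)

# Every column has the same number of rows.
print(nrow(emp))
print(ncol(emp))
```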

 R can create a CSV file from an existing data frame.

 The write.csv() function is used to create the csv file.

 This file gets created in the working directory.

# Create a data frame.

data <- read.csv("input.csv")

retval <- subset(data, as.Date(start_date) > as.Date("2014-01-01"))

# Write filtered data into a new file.

newdata <- read.csv("output.csv")
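The filter, write, and read-back steps can be tried without an existing `input.csv`. The sketch below builds a small hypothetical data frame in place of the input file and writes to a temporary directory rather than the working directory; both choices are assumptions made so the example is self-contained:

```r
# Hypothetical input data standing in for input.csv.
emp <- data.frame(
  id         = 1:4,
  name       = c("Asha", "Bala", "Chitra", "Dev"),
  start_date = c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose start date is after 2014-01-01.
retval <- subset(emp, as.Date(start_date) > as.Date("2014-01-01"))

# Write to a temporary file; row.names = FALSE avoids an extra index column.
out_path <- file.path(tempdir(), "output.csv")
write.csv(retval, out_path, row.names = FALSE)

# Read the file back to confirm the round trip.
newdata <- read.csv(out_path, stringsAsFactors = FALSE)
print(newdata)
```

Without `row.names = FALSE`, `write.csv()` adds a leading column of row names, which then appears as an extra column when the file is read back.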

4 383.5 32.5 147072.25 12463.75

# Capture the data in R format

Month <- c(12, 11,10,9,8,7,6,5,4,3,2,1,12,11,10,9,8,7,6,5,4,3,2,1)

colors <- c("green","orange","brown")
