MALLA REDDY COLLEGE OF ENGINEERING

(Approved by AICTE-New Delhi, Affiliated to JNTUH-Hyderabad)
Recognised under Section 2(f) & 12(B) of the UGC Act 1956,
An ISO 9001:2015 Certified Institution.
Maisammaguda, Dhulapally, post via Kompally, Secunderabad - 500100

Department of Computer Science and Engineering–Data Science

R PROGRAMMING LAB MANUAL

Subject Code : DS504PC

Class : III Year I Sem

Regulation : R22

Academic Year : 2024-2025


SYLLABUS

B.TECH III Year I Sem.    L T P C
                          0 0 2 1
DS505PC: R PROGRAMMING LAB
Pre-requisites: Any programming language.
Course Objectives:
• Familiarize with R basic programming concepts, various data structures for handling datasets, various graph representations, and Exploratory Data Analysis concepts.
Course Outcomes:
• Set up the R programming environment.
• Understand and use R data types and R data structures.
• Develop programming logic using R packages.
• Analyze data sets using R programming capabilities.

LIST OF EXPERIMENTS:
1. Download and install the R programming environment and install basic packages using the install.packages() command in R.
2. Learn all the basics of R programming (data types, variables, operators, etc.).
3. Write R commands to
i) Illustrate summation, subtraction, multiplication, and division operations on vectors.
ii) Enumerate multiplication and division operations between matrices and vectors in the R console.
4. Write R commands to
i) Illustrate the usage of vector subsetting and matrix subsetting.
ii) Create an array of 3×3 matrices with 3 rows and 3 columns.
5. Write an R program to draw i) Pie chart, ii) 3D Pie chart, iii) Bar chart along with a chart legend, considering a suitable CSV file.
6. Create a CSV file having Speed and Distance attributes with 1000 records. Write an R program to draw
i) Box plots
ii) Histogram
iii) Line graph
iv) Multiple line graphs
v) Scatter plot
to demonstrate the relation between the cars' speed and the distance.
7. Implement different data structures in R (Vectors, Lists, Data Frames).
8. Write an R program to read a CSV file and analyze the data in the file using EDA (Exploratory Data Analysis) techniques.
9. Write an R program to illustrate Linear Regression and Multiple Linear Regression, considering a suitable CSV file.

Experiment 1: Download and install the R programming environment and install basic packages using the install.packages() command in R.

STEP BY STEP GUIDE TO INSTALL R :

1) Download the installer from the following link: https://cran.r-project.org/bin/windows/base/

2) Click on the R 3.2.2.exe file. Here 3.2.2 is the version number of the file; it changes with each new release.

3) The setup will request permission to be installed on the system. Click Yes to proceed.

4) Select the preferred language from the drop-down list to run the installation in that language.

5) Click Next to proceed with the installation.

6) Choose the path where you wish to install R by clicking Browse and changing the location, or click Next to accept the default. The minimum space requirements are shown at the bottom of the dialog box; check that your drive has the required amount of free space.

7) Choose the type of installation you require. By default, R installs both the 32-bit and 64-bit versions on your system.

8) To customize the startup options for R, choose the option to customize. To proceed with a vanilla installation, click Next.

9) Create program shortcuts, name them as you require, and specify any necessary customizations. To proceed with the default installation, click Next.

10) Click Next to begin the installation.

11) After the installation has completed, you will see the final screen. Click Finish to complete the installation.

12) Open the Start Menu; R appears in the list of available programs.

13) Click on the R icon in the menu to open R.
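
The experiment also asks for basic packages to be installed with install.packages(), which the installation steps do not show. A minimal sketch follows. The package list is an assumption (packages that later experiments in this manual load), and missing_packages is a hypothetical helper name, not part of R.

```r
# Packages loaded by later experiments in this manual (the exact list is an assumption).
wanted <- c("ggplot2", "plotrix", "dplyr", "readr")

# Hypothetical helper: return the entries of `pkgs` that are not installed yet.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(wanted)
print(to_install)

# Install only what is missing (needs an internet connection).
# Uncomment the next line to run the installation:
# install.packages(to_install)

# After installation, load a package with library(), for example:
# library(ggplot2)
```

Checking first avoids downloading packages that are already present on the lab machine.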

Experiment 2: Learn all the basics of R programming (data types, variables, operators, etc.)

Variables are assigned R-objects, and the data type of the R-object becomes the data type of the variable. There are many types of R-objects. The most frequently used are:
• Vectors
• Lists
• Matrices
• Arrays
• Factors
• Data Frames

The simplest of these objects is the vector. There are six data types of atomic vectors (logical, integer, double, complex, character, and raw), also termed the six classes of vectors. The other R-objects are built upon atomic vectors.

In R, the c() function is used to create a vector. c() is a generic function that combines its arguments into a one-dimensional vector. All arguments are coerced to a common data type, which is the type of the returned value. There are other ways to create a vector in R, such as the following:

1) Using the colon(:) operator

We can create a vector with the colon operator, using the following syntax:

z <- x:y

This creates a vector with the elements from x to y, in steps of 1, and assigns it to z.
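
As a small illustration of the colon operator, the sketch below builds an ascending, a descending, and a non-integer sequence. The variable names w and v are chosen for the example only.

```r
# Ascending integer sequence from 2 to 6
z <- 2:6
print(z)

# The colon operator counts down when x is larger than y
w <- 5:1
print(w)

# With a non-integer start the step is still 1
v <- 1.5:4.5
print(v)

# Whole-number endpoints give an integer vector, others a numeric one
print(class(z))
print(class(v))
```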

Vectors: To create a vector with more than one element, use the c() function, which combines the elements into a vector.
# Create a vector.

apple <- c('red','green',"yellow")

print(apple)

# Get the class of the vector.

print(class(apple))

When we execute the above code, it produces the following result:

[1] "red" "green" "yellow"

[1] "character"
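
The rule stated earlier, that c() gives all of its arguments one common type, can be seen by mixing types in a single call. This is a minimal sketch; the variable names are chosen for the example only.

```r
# A string forces every element to become a string
mixed1 <- c(1, "a", TRUE)
print(mixed1)
print(class(mixed1))

# An integer combined with a double is promoted to double (class "numeric")
mixed2 <- c(1L, 2.5)
print(class(mixed2))

# A logical combined with an integer is promoted to integer
mixed3 <- c(TRUE, 2L)
print(mixed3)
print(class(mixed3))
```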

Lists
A list is an R-object which can contain many different types of elements inside it like vectors,
functions and even another list inside it.
# Create a list.

list1 <- list(c(2,5,3),21.3,sin)

# Print the list.

print(list1)

When we execute the above code, it produces the following result:

[[1]]

[1] 2 5 3

[[2]]

[1] 21.3

[[3]]

function (x) .Primitive("sin")

Matrices

A matrix is a two-dimensional rectangular data set. It can be created using a vector input to the
matrix function.
# Create a matrix.

M = matrix( c('a','a','b','c','b','a'), nrow=2,ncol=3,byrow = TRUE)

print(M)

When we execute the above code, it produces the following result:

     [,1] [,2] [,3]
[1,] "a"  "a"  "b"
[2,] "c"  "b"  "a"

Arrays
While matrices are confined to two dimensions, arrays can have any number of dimensions. The array() function takes a dim argument that sets the size of each dimension. In the example below we create an array made up of two 3x3 matrices.
# Create an array.

a <- array(c('green','yellow'),dim=c(3,3,2))

print(a)

When we execute the above code, it produces the following result:


, , 1

     [,1]     [,2]     [,3]
[1,] "green"  "yellow" "green"
[2,] "yellow" "green"  "yellow"
[3,] "green"  "yellow" "green"

, , 2

     [,1]     [,2]     [,3]
[1,] "yellow" "green"  "yellow"
[2,] "green"  "yellow" "green"
[3,] "yellow" "green"  "yellow"

Variable: A variable provides named storage that our programs can manipulate. A variable in R can store an atomic vector, a group of atomic vectors, or a combination of many R-objects. A valid variable name consists of letters, numbers, and the dot or underscore characters. The name must start with a letter, or with a dot that is not followed by a number.

Variable Name          Validity   Reason
var_name2.             valid      Has letters, numbers, dot and underscore
var_name%              invalid    Has the character '%'. Only dot (.) and underscore are allowed
2var_name              invalid    Starts with a number
.var_name, var.name    valid      Can start with a dot (.), but the dot should not be followed by a number
.2var_name             invalid    The starting dot is followed by a number, making it invalid
_var_name              invalid    Starts with _, which is not valid
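
The naming rules in the table can be checked from the console. The sketch below relies on base R's make.names(), which rewrites any string into a syntactically valid name; a string that comes back unchanged was already valid. Using it as a validity test is an illustration chosen here, not a method taken from the manual.

```r
candidates <- c("var_name2.", "var_name%", "2var_name",
                ".var_name", "var.name", ".2var_name", "_var_name")

# A name is valid when make.names() does not need to change it
is_valid <- make.names(candidates) == candidates

print(data.frame(name = candidates, valid = is_valid))
```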

Operators: An operator is a symbol that tells the R interpreter to perform a specific mathematical or logical manipulation. R is rich in built-in operators and provides the following types of operators.

Types of Operators:

• Arithmetic Operators
• Relational Operators
• Logical Operators
• Assignment Operators
• Miscellaneous Operators

Relational Operators

Operator   Description
>          Checks if each element of the first vector is greater than the corresponding element of the second vector.
<          Checks if each element of the first vector is less than the corresponding element of the second vector.
==         Checks if each element of the first vector is equal to the corresponding element of the second vector.
<=         Checks if each element of the first vector is less than or equal to the corresponding element of the second vector.
>=         Checks if each element of the first vector is greater than or equal to the corresponding element of the second vector.
!=         Checks if each element of the first vector is unequal to the corresponding element of the second vector.
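
A short sketch of these comparison operators applied element by element to two numeric vectors. The vectors v and w are example data chosen here.

```r
v <- c(2, 5.5, 6, 9)
w <- c(8, 2.5, 14, 9)

gt <- v > w    # greater than
lt <- v < w    # less than
eq <- v == w   # equal to
le <- v <= w   # less than or equal to
ge <- v >= w   # greater than or equal to
ne <- v != w   # not equal to

# Each result is a logical vector of the same length as v and w
print(rbind(gt, lt, eq, le, ge, ne))
```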

Logical Operators

The table below shows the logical operators supported by R. They apply only to vectors of type logical, numeric or complex. Zero is treated as FALSE and every non-zero number is treated as TRUE.
Each element of the first vector is combined with the corresponding element of the second vector. The result is a logical (Boolean) value.

Operator   Description
&          Element-wise logical AND. It combines each element of the first vector with the corresponding element of the second vector and gives TRUE if both elements are TRUE.
|          Element-wise logical OR. It combines each element of the first vector with the corresponding element of the second vector and gives TRUE if at least one of the elements is TRUE.
!          Logical NOT. It takes each element of the vector and gives the opposite logical value.
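
A minimal sketch of the three element-wise logical operators on numeric vectors. The values are example data chosen here; the 0.5 entry is included to show that a non-zero number below 1 still counts as TRUE.

```r
v <- c(3, 0, 1, 0.5)
w <- c(4, 0, 0, 2)

and_result <- v & w   # TRUE only where both elements are non-zero
or_result  <- v | w   # TRUE where at least one element is non-zero
not_result <- !v      # TRUE where the element is zero

print(and_result)
print(or_result)
print(not_result)
```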

Assignment Operators:
These operators are used to assign values to variables.

Operator        Description
<-, =, <<-      Called Left Assignment
->, ->>         Called Right Assignment

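
The assignment operators can be tried out as follows. This is a minimal sketch; the names counter, increment and reset are hypothetical and exist only for this example. The double-arrow forms assign to a variable in an enclosing environment rather than creating a local one.

```r
x <- 10      # left assignment
y = 20       # left assignment using =
30 -> z      # right assignment
print(c(x, y, z))

# <<- changes `counter` outside the function instead of creating a local copy
counter <- 0
increment <- function() {
  counter <<- counter + 1
}
increment()
increment()
after_two_calls <- counter
print(after_two_calls)

# ->> is the right-hand form of <<-
reset <- function() {
  0 ->> counter
}
reset()
print(counter)
```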
Experiment 3: Write R commands to
i) Illustrate summation, subtraction, multiplication, and division operations on vectors.

PROGRAM:

Summation of vectors:

# Define two vectors

vec1 <- c(1, 2, 3)

vec2 <- c(4, 5, 6)

# Perform summation

sum_result <- vec1 + vec2

print(sum_result)

Subtraction of vectors:

# Perform subtraction

sub_result <- vec1 - vec2

print(sub_result)

Multiplication of vectors:

# Perform multiplication

mul_result <- vec1 * vec2

print(mul_result)

Division of vectors:

# Perform division

div_result <- vec2 / vec1

print(div_result)
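
The second part of this experiment, multiplication and division between matrices and vectors, has no program in the manual. The sketch below is one possible illustration, with example values chosen here. It shows both the matrix product (%*%) and the element-wise operators (* and /), which recycle the vector down the columns of the matrix.

```r
m <- matrix(1:6, nrow = 2)   # 2 x 3 matrix, filled column by column
v <- c(1, 2, 3)

# Matrix product: (2 x 3) %*% (3 x 1) gives a 2 x 1 matrix
prod_result <- m %*% v
print(prod_result)

# Element-wise multiplication: the vector is recycled down the columns,
# so row 1 is multiplied by 10 and row 2 by 100
mul_result <- m * c(10, 100)
print(mul_result)

# Element-wise division follows the same recycling rule:
# row 1 is divided by 1 and row 2 by 2
div_result <- m / c(1, 2)
print(div_result)
```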

Experiment 4: Write R commands to
i) Illustrate the usage of vector subsetting and matrix subsetting
ii) Create an array of 3×3 matrices with 3 rows and 3 columns.

i) Illustrate the usage of vector subsetting and matrix subsetting:


PROGRAM:
# Vector subsetting
my_vector <- c(1, 2, 3, 4, 5)
subset_vector <- my_vector[c(2, 4)]  # Selecting elements at index 2 and 4
print(subset_vector)

# Matrix subsetting
my_matrix <- matrix(1:9, nrow = 3)
subset_matrix <- my_matrix[2:3, 1:2]  # Selecting rows 2 and 3, columns 1 and 2
print(subset_matrix)

ii) Write a program to create an array of 3×3 matrices with 3 rows and 3 columns:
PROGRAM:

# Create an array of 3x3 matrices
num_matrices <- 3
matrix_rows <- 3
matrix_cols <- 3

# Initialize an empty list to store matrices
matrix_list <- list()

# Populate the list with matrices
for (i in 1:num_matrices) {
  # Create a 3x3 matrix
  my_matrix <- matrix(1:(matrix_rows * matrix_cols), nrow = matrix_rows)
  # Add the matrix to the list
  matrix_list[[i]] <- my_matrix
}

# Convert the list of matrices to an array.
# unlist() flattens the matrices into one vector, so that array() builds a
# numeric 3 x 3 x 3 array instead of an array of list elements.
my_array <- array(unlist(matrix_list), dim = c(matrix_rows, matrix_cols, num_matrices))

# Print the array
print(my_array)

Experiment 5 - Write an R program to draw i) Pie chart ii) 3D Pie Chart, iii) Bar Chart along
with chart legend by considering suitable CSV file
PROGRAM:

i. Pie chart

# Create data for the graph.

geeks<- c(23, 56, 20, 63)

labels <- c("Mumbai", "Pune", "Chennai", "Bangalore")

# Plot the chart with a title and the rainbow color palette.

pie(geeks, labels, main = "City pie chart",

col = rainbow(length(geeks)))

3D pie chart: A 3D pie chart showing the same data as the regular pie chart

PROGRAM:

# Get the library.


library(plotrix)

# Create data for the graph.


geeks <- c(23, 56, 20, 63)
labels <- c("Mumbai", "Pune", "Chennai", "Bangalore")

piepercent<- round(100 * geeks / sum(geeks), 1)

# Plot the chart.


pie3D(geeks, labels = piepercent,
main = "City pie chart", col = rainbow(length(geeks)))
legend("topright", c("Mumbai", "Pune", "Chennai", "Bangalore"),
cex = 0.5, fill = rainbow(length(geeks)))

Bar chart: A stacked bar chart showing the revenue for each month, with one coloured segment per region.

PROGRAM:
colors = c("green", "orange", "brown")
months <- c("Mar", "Apr", "May", "Jun", "Jul")
regions <- c("East", "West", "North")
# Create the matrix of the values.
Values <- matrix(c(2, 9, 3, 11, 9, 4, 8, 7, 3, 12, 5, 2, 8, 10, 11),
nrow = 3, ncol = 5, byrow = TRUE)
# Create the bar chart
barplot(Values, main = "Total Revenue", names.arg = months,
xlab = "Month", ylab = "Revenue", col = colors)
# Add the legend to the chart
legend("topleft", regions, cex = 0.7, fill = colors)
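
The experiment statement asks for the charts to be drawn from a CSV file, whereas the programs above use values typed into the script. The sketch below shows one way to do this in base R. The file name city_sales.csv and the column names City and Sales are assumptions made for the example, and the file is written to a temporary folder first so that the code runs without any external data.

```r
# Write a small example CSV to a temporary file (hypothetical data)
csv_path <- file.path(tempdir(), "city_sales.csv")
write.csv(data.frame(City = c("Mumbai", "Pune", "Chennai", "Bangalore"),
                     Sales = c(23, 56, 20, 63)),
          csv_path, row.names = FALSE)

# Read the CSV file back into a data frame
sales <- read.csv(csv_path)
print(sales)

chart_colors <- rainbow(nrow(sales))

# Pie chart with a legend
pie(sales$Sales, labels = sales$Sales, main = "City pie chart", col = chart_colors)
legend("topright", legend = sales$City, cex = 0.7, fill = chart_colors)

# Bar chart with a legend
barplot(sales$Sales, names.arg = sales$City, main = "Sales by city",
        xlab = "City", ylab = "Sales", col = chart_colors)
legend("topleft", legend = sales$City, cex = 0.7, fill = chart_colors)
```

To use your own data, replace csv_path with the path to your file and adjust the column names.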

Experiment 6: Create a CSV file having Speed and Distance attributes with 1000 records. Write an R program to draw
a) Box plots
b) Histogram
c) Line Graph
d) Multiple line graphs
e) Scatter plot
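
The programs in this experiment use built-in data sets and typed-in values rather than a Speed and Distance CSV file. The sketch below shows one way to create such a file with 1000 records and draw all five plots from it in base R. The data is synthetic: the speed range, the slope of 3.5 and the noise level are assumptions made for illustration, as is the file name speed_distance.csv.

```r
set.seed(1)  # make the synthetic data reproducible
n <- 1000

# Synthetic data (an assumption): distance grows with speed, plus random noise
speed <- round(runif(n, min = 5, max = 25), 1)
distance <- round(pmax(1, 3.5 * speed + rnorm(n, mean = 0, sd = 10)), 1)

# Write the 1000 records to a CSV file and read them back
csv_path <- file.path(tempdir(), "speed_distance.csv")
write.csv(data.frame(Speed = speed, Distance = distance), csv_path, row.names = FALSE)
cars_data <- read.csv(csv_path)
str(cars_data)

# a) Box plots of both attributes
boxplot(cars_data, main = "Speed and Distance", col = c("orange", "lightblue"))

# b) Histogram of speed
hist(cars_data$Speed, main = "Histogram of Speed", xlab = "Speed", col = "green")

# c) Line graph: mean distance for each whole-number speed
mean_dist <- tapply(cars_data$Distance, round(cars_data$Speed), mean)
plot(as.numeric(names(mean_dist)), mean_dist, type = "o",
     xlab = "Speed", ylab = "Mean distance", main = "Mean Distance by Speed")

# d) Multiple line graphs: the first 50 records of both attributes
matplot(cars_data[1:50, ], type = "l", lty = 1, col = c("red", "blue"),
        xlab = "Record", ylab = "Value", main = "Speed and Distance")
legend("topleft", legend = names(cars_data), col = c("red", "blue"), lty = 1)

# e) Scatter plot of distance against speed
plot(cars_data$Speed, cars_data$Distance, xlab = "Speed", ylab = "Distance",
     main = "Speed vs Distance")
```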

a) Box plots

# Load the dataset
data(mtcars)
# Set up plot colors, one per gear level (3, 4 and 5)
my_colors <- c("#FFA500", "#008000", "#1E90FF")
# Create the box plot with customized aesthetics
boxplot(disp ~ gear, data = mtcars,
        main = "Displacement by Gear", xlab = "Gear", ylab = "Displacement",
        col = my_colors, border = "black", notch = TRUE, notchwidth = 0.5,
        medcol = "white", whiskcol = "black", boxwex = 0.5, outpch = 19,
        outcol = "black")
# Add a legend; sort the gear values so that they match the order of the boxes
legend("topright", legend = sort(unique(mtcars$gear)),
       fill = my_colors, border = "black", title = "Gear")

b) Histogram

PROGRAM:
# Create data for the graph.
v <- c(19, 23, 11, 5, 16, 21, 32,
14, 19, 27, 39)
# Create the histogram.
hist(v, xlab = "No.of Articles ",
col = "green", border = "black")

c) Line Graph

PROGRAM:

# Create the data for the chart.
v <- c(17, 25, 38, 13, 41)
# Plot the line chart.
plot(v, type = "o")

d) Multiple line graphs

PROGRAM:

library("ggplot2")

gfg_data <- data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),


y1 = c(1.1, 2.4, 3.5, 4.1, 5.9, 6.7,
7.1, 8.3, 9.4, 10.0),
y2 = c(7, 5, 1, 7, 4, 9, 2, 3, 1, 4),
y3 = c(5, 6, 4, 5, 1, 8, 7, 4, 5, 4),
y4 = c(1, 4, 8, 9, 6, 1, 1, 8, 9, 1),
y5 = c(1, 1, 1, 3, 3, 7, 7, 10, 10, 10))

gfg_plot <- ggplot(gfg_data, aes(x)) +


geom_line(aes(y = y1), color = "black") +

geom_line(aes(y = y2), color = "red") +


geom_line(aes(y = y3), color = "green") +
geom_line(aes(y = y4), color = "blue") +
geom_line(aes(y = y5), color = "purple")
gfg_plot

e) Scatter plot
PROGRAM:

# Get the input values.
input <- mtcars[, c('wt', 'mpg')]

# Plot the chart for cars with
# weight between 1.5 and 4 and
# mileage between 10 and 25.
plot(x = input$wt, y = input$mpg,
     xlab = "Weight",
     ylab = "Mileage",
     xlim = c(1.5, 4),
     ylim = c(10, 25),
     main = "Weight vs Mileage"
)

Experiment 7: Implement different data structures in R (Vectors, Lists, Data Frames)

Vectors

PROGRAM:

# R program to illustrate Vector
# Vectors (ordered collection of same data type)
X = c(1, 3, 5, 7, 8)
# Printing those elements in console
print(X)

Lists:
Lists can hold elements of different data types and can be nested.

empId = c(1, 2, 3, 4)
empName = c("Debi", "Sandeep", "Subham", "Shiba")
numberOfEmp = 4
empList = list(empId, empName, numberOfEmp)
print(empList)

Data Frames:

PROGRAM:

Name = c("Anand", "kamal", "Pandu")
Language = c("R", "Python", "Java")
Age = c(22, 25, 45)
df = data.frame(Name, Language, Age)
print(df)
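
Once created, these structures are accessed in slightly different ways. The sketch below rebuilds the list and the data frame with the same values and shows a few common access patterns. Naming the list elements (id, name, count) is an addition made for this example.

```r
# A list with named elements
emp_list <- list(id = c(1, 2, 3, 4),
                 name = c("Debi", "Sandeep", "Subham", "Shiba"),
                 count = 4)

second_name <- emp_list$name[2]   # $ selects an element by name
first_id <- emp_list[["id"]][1]   # [[ ]] does the same with a string
print(second_name)
print(first_id)

# A data frame: columns are vectors of equal length
df <- data.frame(Name = c("Anand", "kamal", "Pandu"),
                 Language = c("R", "Python", "Java"),
                 Age = c(22, 25, 45))

ages <- df$Age                    # one column as a vector
older <- df[df$Age > 23, ]        # rows that satisfy a condition
print(ages)
print(older)

# str() gives a compact summary of any R object
str(df)
```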

Experiment 8: Write an R program to read a CSV file and analyze the data in the file using EDA (Exploratory Data Analysis) techniques.

# Install and load necessary packages
install.packages(c("readr", "dplyr", "ggplot2"))

library(readr)
library(dplyr)
library(ggplot2)

# Step 1: Read the CSV file
file_path <- "data.csv"
data <- read_csv(file_path)

# Step 2: Display the first few rows of the data
cat("First few rows of the data:\n")
print(head(data))

# Step 3: Summary statistics
cat("\nSummary statistics:\n")
print(summary(data))

# Step 4: Data structure
cat("\nData structure:\n")
str(data)

# Step 5: Exploratory Data Analysis (EDA) - Example: Histogram
# Variable_of_Interest is a placeholder; replace it with a numeric column of your data.
ggplot(data, aes(x = Variable_of_Interest)) +
  geom_histogram(binwidth = 5, fill = "blue", color = "black") +
  labs(title = "Histogram of Variable_of_Interest", x = "Variable_of_Interest", y = "Frequency")

# Add more EDA visualizations and analyses as needed based on your data

# Save plots (optional)
ggsave("histogram.png", plot = last_plot())

# Close the graphics device (optional)
dev.off()
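
Two further EDA steps that usually follow the summary are counting missing values and looking at correlations between numeric columns. The sketch below uses base R only and the built-in airquality data set as a stand-in for a CSV file, which is an assumption made so that the code runs without external data.

```r
# Stand-in for data read from a CSV file (assumption: built-in data set)
aq <- airquality

# Count the missing values in each column
na_counts <- colSums(is.na(aq))
print(na_counts)

# Keep only the rows without any missing value
complete_rows <- aq[complete.cases(aq), ]
cat("Rows before:", nrow(aq), " rows after:", nrow(complete_rows), "\n")

# Correlation matrix of selected numeric columns
cor_matrix <- cor(complete_rows[, c("Ozone", "Wind", "Temp")])
print(round(cor_matrix, 2))
```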

PROGRAM: DATA INSPECTION IN EDA

# Data Inspection in EDA

# Loading the required packages
library(aqp)
library(soilDB)

# Load the loafercreek dataset
data("loafercreek")

# Construct generalized horizon designations
n <- c("A", "BAt", "Bt1", "Bt2", "Cr", "R")

# REGEX rules
p <- c("A", "BA|AB", "Bt|Bw", "Bt3|Bt4|2B|C", "Cr", "R")

# Compute genhz labels and add them to the loafercreek dataset
loafercreek$genhz <- generalize.hz(loafercreek$hzname, n, p)

# Extract the horizon table
h <- horizons(loafercreek)

# Examine how the genhz labels match the hznames
table(h$genhz, h$hzname)

vars <- c("genhz", "clay", "total_frags_pct", "phfield", "effclass")
summary(h[, vars])

sort(unique(h$hzname))
h$hzname <- ifelse(h$hzname == "BT", "Bt", h$hzname)

PROGRAM(GRAPHICAL METHOD)
# EDA Graphical Method Distributions

# loading the required packages


library("ggplot2")
library(aqp)
library(soilDB)

# Load from the loafercreek dataset


data("loafercreek")

# Construct generalized horizon designations


n <- c("A", "BAt", "Bt1", "Bt2", "Cr", "R")

# REGEX rules
p <- c("A", "BA|AB", "Bt|Bw", "Bt3|Bt4|2B|C","Cr", "R")

# Compute genhz labels and add


# to loafercreek dataset
loafercreek$genhz <- generalize.hz(loafercreek$hzname, n, p)

# Extract the horizon table


h <- horizons(loafercreek)

# Examine the matching of pairing


# of the genhz label to the hznames
table(h$genhz, h$hzname)

vars <- c("genhz", "clay", "total_frags_pct", "phfield", "effclass")


summary(h[, vars])

sort(unique(h$hzname))
h$hzname <- ifelse(h$hzname == "BT", "Bt", h$hzname)

# graphs
# bar plot
ggplot(h, aes(x = texcl)) +geom_bar()

# histogram
ggplot(h, aes(x = clay)) +
geom_histogram(bins = nclass.Sturges(h$clay))

# density curve
ggplot(h, aes(x = clay)) + geom_density()

# box plot
ggplot(h, (aes(x = genhz, y = clay))) +
geom_boxplot()

# QQ Plot for Clay


ggplot(h, aes(sample = clay)) +
geom_qq() +
geom_qq_line()

Experiment 9: Write an R program to illustrate Linear Regression and Multiple Linear Regression considering a suitable CSV file.

# Linear Regression and Multiple Linear Regression example using the mtcars dataset

# Step 1: Load necessary libraries
library(ggplot2)
library(tidyr)
library(dplyr)

# Step 2: Load the mtcars dataset (or use your own CSV file)
data(mtcars)

# Step 3: Simple Linear Regression (Example: mpg vs. horsepower)
# In mtcars the horsepower column is named hp.
cat("Simple Linear Regression:\n")
lm_model <- lm(mpg ~ hp, data = mtcars)
summary(lm_model)

# Plotting the Simple Linear Regression line
ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +
  labs(title = "Simple Linear Regression", x = "Horsepower", y = "Miles per Gallon")

# Step 4: Multiple Linear Regression (Example: mpg vs. horsepower + weight)
# In mtcars the weight column is named wt and is measured in units of 1000 lbs.
cat("\nMultiple Linear Regression:\n")
mlm_model <- lm(mpg ~ hp + wt, data = mtcars)
summary(mlm_model)

# Step 5: Predicting using the Multiple Linear Regression model
new_data <- data.frame(hp = c(150, 200), wt = c(3.0, 3.5))
predictions <- predict(mlm_model, newdata = new_data)
cat("\nPredictions using Multiple Linear Regression:\n")
print(data.frame(new_data, predictions))

# Add more features to the Multiple Linear Regression model as needed

# Save plots (optional)
ggsave("linear_regression_plot.png", plot = last_plot())

# Close the graphics device (optional)
dev.off()

PROGRAM(LINEAR REGRESSION)
# R program to illustrate
# Linear Regression

# Height vector
x <- c(153, 169, 140, 186, 128,136, 178, 163, 152, 133)

# Weight vector
y <- c(64, 81, 58, 91, 47, 57,75, 72, 62, 49)

# Create a linear regression model


model <- lm(y~x)

# Print regression model


print(model)

# Find the weight of a person


# With height 182
df <- data.frame(x = 182)
res <- predict(model, df)
cat("\nPredicted value of a person with height = 182:\n")
print(res)

# Output to be present as PNG file


png(file = "linearRegGFG.png")

# Plot
plot(x, y, main = "Height vs Weight Regression model")
abline(lm(y~x))

# Save the file.


dev.off()
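
The prediction returned by predict() is simply the fitted line evaluated at the new height: weight = intercept + slope × height. The sketch below repeats the fit with the same height and weight vectors and checks that a calculation by hand from the coefficients gives the same value. The names manual_prediction and model_prediction are chosen for this example.

```r
# Same height (x) and weight (y) data as in the program above
x <- c(153, 169, 140, 186, 128, 136, 178, 163, 152, 133)
y <- c(64, 81, 58, 91, 47, 57, 75, 72, 62, 49)
model <- lm(y ~ x)

# coef() returns a named vector: "(Intercept)" and the slope "x"
b <- coef(model)
print(b)

# Evaluate the fitted line by hand at height 182
manual_prediction <- b[["(Intercept)"]] + b[["x"]] * 182

# Compare with predict()
model_prediction <- unname(predict(model, data.frame(x = 182)))
print(c(manual = manual_prediction, predict = model_prediction))
```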

PROGRAM: MULTIPLE REGRESSION

# R program to illustrate
# Multiple Linear Regression

# Using airquality dataset
input <- airquality[1:50, c("Ozone", "Wind", "Temp")]

# Create regression model
model <- lm(Ozone ~ Wind + Temp, data = input)

# Print the regression model
cat("Regression model:\n")
print(model)

# Output to be saved as a PNG file
png(file = "multipleRegGFG.png")

# plot(model) draws four diagnostic plots; a 2 x 2 layout keeps them all
# in the single PNG file instead of each one overwriting the last.
par(mfrow = c(2, 2))
plot(model)

# Save the file.
dev.off()

Lead Programs:
1. Linear Discriminant Analysis is computed using the lda() function from the MASS package. Let's use the built-in iris data set.

library(MASS)
library(tidyverse)
library(caret)
theme_set(theme_classic())

# Load the data


data("iris")

# Split the data into training (80%) and test set (20%)
set.seed(123)
training.individuals <- iris$Species %>%
createDataPartition(p = 0.8, list = FALSE)
train.data <- iris[training.individuals, ]
test.data <- iris[-training.individuals, ]

# Estimate preprocessing parameters


preproc.parameter <- train.data %>%
preProcess(method = c("center", "scale"))

# Transform the data using the estimated parameters


train.transform <- preproc.parameter %>% predict(train.data)
test.transform <- preproc.parameter %>% predict(test.data)

# Fit the model


model <- lda(Species~., data = train.transform)

# Make predictions
predictions <- model %>% predict(test.transform)

# Model accuracy
mean(predictions$class == test.transform$Species)

# Print the fitted model
model
2. Decision Tree for Regression in R Programming.

# Load the required library


library(rpart)

# Load a sample dataset


data(mtcars)

# Create a CART model for regression


cart_model <- rpart(mpg ~ ., data = mtcars)

# Print the model summary


print(cart_model)

# Make predictions using the model


predictions <- predict(cart_model, newdata = mtcars)
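
The predictions can be turned into a single error figure. The sketch below refits the same tree and computes the root mean squared error (RMSE) on the training data, next to the RMSE of always predicting the mean mpg as a baseline. Because the error is measured on the data the tree was fitted to, it will tend to be optimistic; that caveat and the baseline comparison are additions made here.

```r
library(rpart)

# Same model as above: regression tree for mpg using all other columns
cart_model <- rpart(mpg ~ ., data = mtcars)
predictions <- predict(cart_model, newdata = mtcars)

# Root mean squared error of the tree on the training data
rmse <- sqrt(mean((mtcars$mpg - predictions)^2))

# Baseline: always predict the mean mpg
baseline_rmse <- sqrt(mean((mtcars$mpg - mean(mtcars$mpg))^2))

cat("Tree RMSE:", round(rmse, 2), "\n")
cat("Baseline RMSE:", round(baseline_rmse, 2), "\n")
```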
