R Lab Manual
1. Demonstrate the steps for installation of R and RStudio. Perform the following:
a) Assign values of different types to variables and display the type of each variable. Use
types such as Double, Integer, Logical, Complex and Character, and understand the difference
between each data type.
# Double (class() reports "numeric" for a double)
double_var <- 3.14
print(class(double_var))
# Integer
int_var <- 42L
print(class(int_var))
# Logical
logical_var <- TRUE
print(class(logical_var))
# Complex
complex_var <- 2 + 3i
print(class(complex_var))
# Character
char_var <- "Hello, World!"
print(class(char_var))
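To see the difference between the data types more clearly, it can help to compare class() with typeof(), which reports the underlying storage type. The following is a minimal sketch that reuses the variable names from the listing above; plain_number is a hypothetical name introduced only for illustration.

```r
# class() gives the high-level class; typeof() gives the storage type
double_var <- 3.14
int_var <- 42L

cat(class(double_var), typeof(double_var), "\n")  # numeric double
cat(class(int_var), typeof(int_var), "\n")        # integer integer

# Without the L suffix, a whole number is still stored as a double
plain_number <- 42  # hypothetical variable, for illustration only
is_int <- is.integer(plain_number)
print(is_int)  # FALSE

# Values can be converted between types with the as.*() functions
as_text <- as.character(int_var)
print(as_text)  # "42"
```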
Output:
a <- 5
b <- 3
sum_ab <- a + b
diff_ab <- a - b
product_ab <- a * b
quotient_ab <- a / b
# Logical operations
x <- TRUE
y <- FALSE
and_result <- x && y # Logical AND
or_result <- x || y # Logical OR
not_x <- !x # Logical NOT
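The operators && and || above work on single logical values. For whole vectors, R also provides the element-wise forms & and |, along with a few more arithmetic operators. The sketch below is only illustrative; the vectors p and q are hypothetical sample data.

```r
# Element-wise logical operators on vectors (hypothetical sample data)
p <- c(TRUE, FALSE, TRUE)
q <- c(TRUE, TRUE, FALSE)
and_vec <- p & q   # compares the vectors position by position
or_vec <- p | q
print(and_vec)  # TRUE FALSE FALSE
print(or_vec)   # TRUE TRUE TRUE

# Further arithmetic operators on the same values as above
a <- 5
b <- 3
int_quotient <- a %/% b  # integer division
remainder <- a %% b      # modulus
power <- a ^ b           # exponentiation
print(c(int_quotient, remainder, power))  # 1 2 125
```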
Output:
# Sequence from 1 to 10
seq_vector <- 1:10
# Sequence from 1 to 10 with a step of 2
seq_vector_step <- seq(1, 10, by = 2)
print(seq_vector)
print(seq_vector_step)
Output:
# Vectors
vec <- c(10, 20, 30, 40, 50)
element <- vec[3] # Extract the third element
# Matrices
mat <- matrix(1:9, nrow = 3, ncol = 3)
element_mat <- mat[2, 2] # Extract element in the second row and second column
Output:
2. Assess the financial statement of an organization, given two vectors of data:
Monthly Revenue and Monthly Expenses for the financial year. (You can create your own
sample data vectors for this experiment.)
Calculate the following financial metrics:
a. Profit for each month.
# Sample data for monthly revenue and monthly expenses (in thousands of dollars)
monthly_revenue <- c(100, 120, 130, 110, 140, 160, 150, 130, 140, 170, 180, 160)
monthly_expenses <- c(70, 80, 85, 75, 90, 100, 95, 80, 85, 95, 100, 90)
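The later steps use profit_after_tax, but the profit and tax steps are not shown here. A minimal sketch of how they might look follows. The 30% tax rate is an assumption made for illustration only; it is not given in the exercise. The data vectors are repeated so that the snippet runs on its own.

```r
# Sample data repeated from the listing above (in thousands of dollars)
monthly_revenue <- c(100, 120, 130, 110, 140, 160, 150, 130, 140, 170, 180, 160)
monthly_expenses <- c(70, 80, 85, 75, 90, 100, 95, 80, 85, 95, 100, 90)

# a) Profit for each month: revenue minus expenses, element by element
profit <- monthly_revenue - monthly_expenses

# Profit after tax for each month
tax_rate <- 0.30  # ASSUMPTION: illustrative rate, not specified in the exercise
profit_after_tax <- profit * (1 - tax_rate)

print(profit)
print(round(profit_after_tax, 2))
```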
c. Profit margin for each month, equal to profit after tax divided by revenue.
# Calculate profit margin for each month as a percentage (profit after tax divided by revenue)
profit_margin <- (profit_after_tax / monthly_revenue) * 100
d. Good Months – where the profit after tax was greater than the mean for the year.
# Calculate the mean profit after tax for the year
mean_profit_after_tax <- mean(profit_after_tax)
# Find the good months (where profit after tax was greater than the mean for the year)
good_months <- which(profit_after_tax > mean_profit_after_tax)
e. Bad Months – where the profit after tax was less than the mean for the year.
# Find the bad months (where profit after tax was less than the mean for the year)
bad_months <- which(profit_after_tax < mean_profit_after_tax)
f. The best month – where the profit after tax was max for the year.
# Find the best month (where profit after tax was maximum for the year)
best_month <- which.max(profit_after_tax)
g. The worst month – where the profit after tax was min for the year.
# Find the worst month (where profit after tax was minimum for the year)
worst_month <- which.min(profit_after_tax)
Output:
3. Develop a program to create two 3 x 3 matrices A and B and perform the following
operations:
a) Transpose of the matrix
b) Addition
c) Subtraction
d) Multiplication
# Create Matrix A
A <- matrix(1:9, nrow = 3, byrow = TRUE)
cat("Matrix A:\n")
print(A)
# Create Matrix B
B <- matrix(10:18, nrow = 3, byrow = TRUE)
cat("Matrix B:\n")
print(B)
# Transpose of Matrix A
cat("Transpose of Matrix A:\n")
print(t(A))
# Element-wise multiplication of A and B (A %*% B would give the matrix product)
print(A * B)
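The listing above shows the transpose and one multiplication. The remaining operations might be sketched as follows, using the same matrices A and B. Note that * multiplies element by element, while %*% computes the matrix product; which one the exercise intends by "multiplication" is not stated, so both are shown.

```r
# Same matrices as above
A <- matrix(1:9, nrow = 3, byrow = TRUE)
B <- matrix(10:18, nrow = 3, byrow = TRUE)

# b) Addition and c) subtraction work element by element
sum_AB <- A + B
diff_AB <- A - B

# d) Two kinds of multiplication
elementwise_AB <- A * B       # multiplies matching entries
matrix_product_AB <- A %*% B  # rows of A times columns of B

cat("A + B:\n"); print(sum_AB)
cat("A - B:\n"); print(diff_AB)
cat("A * B (element-wise):\n"); print(elementwise_AB)
cat("A %*% B (matrix product):\n"); print(matrix_product_AB)
```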
Output:
4. Develop a program to find the factorial of a given number using recursive function calls.
factorial <- function(n) {
  if (n <= 1) {
    return(1)
  } else {
    return(n * factorial(n - 1))
  }
}
input <- readline(prompt = "Enter an integer number: ")
# Convert the input to an integer
n <- as.integer(input)
fact <- factorial(n)
cat("Factorial of the given number is", fact, "\n")
Output:
Enter an integer number: 5
Factorial of the given number is 120
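The program above assumes the user types a valid non-negative integer, and its function name masks R's built-in factorial(). A possible variant with basic input checking is sketched below; the name fact_recursive is hypothetical and chosen only to avoid the name clash.

```r
# Recursive factorial with basic validation (hypothetical helper name)
fact_recursive <- function(n) {
  if (is.na(n) || n < 0) {
    stop("n must be a non-negative integer")
  }
  if (n <= 1) {
    return(1)  # base case: 0! and 1! are both 1
  }
  n * fact_recursive(n - 1)  # recursive case
}

result_5 <- fact_recursive(5)
print(result_5)  # 120

# Cross-check against R's built-in factorial()
matches_base <- isTRUE(all.equal(fact_recursive(10), base::factorial(10)))
print(matches_base)

# Invalid input raises an error instead of recursing
error_caught <- tryCatch({
  fact_recursive(-3)
  FALSE
}, error = function(e) TRUE)
print(error_caught)
```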
5. Develop an R Program using functions to find all the prime numbers up to a specified number
by the method of Sieve of Eratosthenes.
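sieve <- function(n) {
  n <- as.integer(n)
  #if(n > 1e6) stop("n too large")
  if (n < 2L) return(integer(0)) # no primes below 2
  primes <- rep(TRUE, n)
  primes[1] <- FALSE
  last.prime <- 2L
  for (i in last.prime:floor(sqrt(n))) {
    # If i is still marked prime, cross out its multiples starting at i^2
    if (primes[i] && i * i <= n) {
      primes[seq(i * i, n, by = i)] <- FALSE
    }
  }
  # The positions still marked TRUE are the primes
  which(primes)
}
sieve(100)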
Output:
6. The data set mammals (from the MASS package) contains data on body weight versus brain
weight. Develop R commands to:
a) Find the Pearson and Spearman correlation coefficients. Are they similar?
b) Plot the data using the plot command.
c) Plot the logarithm (log) of each variable and see if that makes a difference.
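# MASS ships with standard R installations; install it only if it is missing
if (!requireNamespace("MASS", quietly = TRUE)) install.packages("MASS")
library(MASS)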
# Load the dataset
data("mammals")
mammals_clean <- mammals[complete.cases(mammals), ]
# View the first few rows of the dataset
head(mammals_clean)
# Spearman correlation
spearman_corr <- cor(mammals_clean$brain, mammals_clean$body, method = "spearman")
cat("Spearman Correlation Coefficient:", spearman_corr, "\n")
# b) Plot the data using the plot command (in mammals, body is in kg and brain is in g)
plot(mammals_clean$body, mammals_clean$brain, main = "Body Weight vs. Brain Weight",
     xlab = "Body Weight (kg)", ylab = "Brain Weight (g)", pch = 19)
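The Pearson coefficient for part (a) and the log plot for part (c) are not shown in the listing above. One way they might be written is sketched here, assuming the MASS package is available. Because the logarithm preserves the ordering of values, the Spearman coefficient should be unchanged by the transformation, while the Pearson coefficient can change.

```r
library(MASS)
data("mammals")

# a) Pearson correlation on the raw data
pearson_corr <- cor(mammals$brain, mammals$body, method = "pearson")
cat("Pearson Correlation Coefficient:", pearson_corr, "\n")

# c) Plot the logarithm of each variable
plot(log(mammals$body), log(mammals$brain),
     main = "log(Body Weight) vs. log(Brain Weight)",
     xlab = "log(Body Weight in kg)", ylab = "log(Brain Weight in g)", pch = 19)

# Correlations after the log transformation, for comparison
log_pearson_corr <- cor(log(mammals$brain), log(mammals$body), method = "pearson")
spearman_raw <- cor(mammals$brain, mammals$body, method = "spearman")
spearman_log <- cor(log(mammals$brain), log(mammals$body), method = "spearman")
cat("Pearson on log scale:", log_pearson_corr, "\n")
cat("Spearman raw vs. log:", spearman_raw, spearman_log, "\n")
```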
7. Develop an R program to create a Data Frame with the following details and perform the
following operations.
a) Subset the Data frame and display the details of only those items whose price is greater than
or equal to 350.
b) Subset the Data frame and display only the items where the category is either “Office
Supplies” or “Desktop Supplies”
c) Create another Data Frame called “item-details” with three different fields itemCode,
ItemQtyonHand and ItemReorderLvl and merge the two frames
# b) Subset the Data frame and display items with category "Office Supplies" or "Desktop Supplies"
# c) Create another Data Frame called "item-details" with three fields and merge the two frames
item_details <- data.frame(
itemCode = c("A123", "B456", "C789", "D101", "E202"),
ItemQtyonHand = c(100, 25, 50, 200, 10),
ItemReorderLvl = c(20, 5, 10, 100, 3)
)
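The full program for this exercise might look like the sketch below. The items data frame, including its column names itemCategory and itemPrice and all of its values, is hypothetical sample data: the original item table is not reproduced here. Only item_details is taken from the listing above.

```r
# ASSUMPTION: hypothetical item data; the original table is not available
items <- data.frame(
  itemCode = c("A123", "B456", "C789", "D101", "E202"),
  itemCategory = c("Office Supplies", "Desktop Supplies", "Electronics",
                   "Office Supplies", "Furniture"),
  itemPrice = c(400, 300, 1200, 350, 150),
  stringsAsFactors = FALSE
)

# a) Items whose price is greater than or equal to 350
expensive_items <- subset(items, itemPrice >= 350)
print(expensive_items)

# b) Items in either of the two supply categories
supply_items <- subset(items,
                       itemCategory %in% c("Office Supplies", "Desktop Supplies"))
print(supply_items)

# c) Second data frame (from the listing above), merged on the shared key
item_details <- data.frame(
  itemCode = c("A123", "B456", "C789", "D101", "E202"),
  ItemQtyonHand = c(100, 25, 50, 200, 10),
  ItemReorderLvl = c(20, 5, 10, 100, 3)
)
merged_items <- merge(items, item_details, by = "itemCode")
print(merged_items)
```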
Output:
8. Let us use the built-in dataset airquality, which has daily air quality measurements in New
York, May to September 1973. Develop an R program to generate a histogram by using appropriate
arguments for the following statements.
a) Assigning names, using the airquality data set.
b) Change colors of the Histogram
c) Remove Axis and Add labels to Histogram
d) Change Axis limits of a Histogram
e) Add Density curve to the histogram
Concentration")
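Steps (a) to (e) could be approached as in the following sketch, which uses the Ozone column of airquality. Reading "assigning names" as setting the title and axis labels is an interpretation, and the colors and axis limits are arbitrary choices for illustration.

```r
data(airquality)
# Ozone has missing values; drop them so density() can be used later
ozone <- airquality$Ozone[!is.na(airquality$Ozone)]

# a) Assign a title and axis names
hist(ozone, main = "Histogram of Ozone Concentration",
     xlab = "Ozone (ppb)", ylab = "Frequency")

# b) Change the fill and border colors
hist(ozone, col = "skyblue", border = "darkblue",
     main = "Histogram of Ozone Concentration", xlab = "Ozone (ppb)")

# c) Remove the axes and add count labels above the bars
hist(ozone, axes = FALSE, labels = TRUE, col = "skyblue",
     main = "Histogram of Ozone Concentration", xlab = "Ozone (ppb)")

# d) Change the axis limits
hist(ozone, xlim = c(0, 200), ylim = c(0, 50), col = "skyblue",
     main = "Histogram of Ozone Concentration", xlab = "Ozone (ppb)")

# e) Plot densities instead of counts and overlay a density curve
hist(ozone, freq = FALSE, col = "skyblue",
     main = "Histogram of Ozone Concentration", xlab = "Ozone (ppb)")
lines(density(ozone), col = "red", lwd = 2)
```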
Output:
9. Design a data frame in R for storing about 20 employee details. Create a CSV file named
“input.csv” that defines all the required information about the employee such as id, name,
salary, start_date, dept. Import into R and do the following analysis.
a) Find the total number of rows & columns
b) Find the maximum salary
c) Retrieve the details of the employee with maximum salary
d) Retrieve all the employees working in the IT Department.
e) Retrieve the employees in the IT Department whose salary is greater than 20000 and write
these details into another file “output.csv”
# e) Retrieve the employees in the IT Department whose salary is greater than 20000
it_department_high_salary <-
  it_department_employees[it_department_employees$salary > 20000, ]
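An end-to-end version of this exercise might look like the sketch below. All employee records are generated sample data, and the files are written to a temporary directory (an assumption made so the example does not touch the working directory); in the lab, plain "input.csv" and "output.csv" paths would be used instead.

```r
# ASSUMPTION: generated sample data for 20 employees
employees <- data.frame(
  id = 1:20,
  name = paste0("Employee", 1:20),
  salary = seq(15000, 34000, by = 1000),
  start_date = as.character(seq(as.Date("2015-01-01"), by = "month", length.out = 20)),
  dept = rep(c("IT", "HR", "Finance", "Operations"), times = 5),
  stringsAsFactors = FALSE
)

# Create the CSV file, then import it back into R
input_path <- file.path(tempdir(), "input.csv")
output_path <- file.path(tempdir(), "output.csv")
write.csv(employees, input_path, row.names = FALSE)
emp <- read.csv(input_path, stringsAsFactors = FALSE)

# a) Total number of rows and columns
dims <- dim(emp)
cat("Rows:", dims[1], "Columns:", dims[2], "\n")

# b) Maximum salary
max_salary <- max(emp$salary)

# c) Details of the employee with the maximum salary
top_earner <- emp[emp$salary == max_salary, ]

# d) All employees in the IT department
it_department_employees <- emp[emp$dept == "IT", ]

# e) IT employees earning more than 20000, written to another file
it_department_high_salary <-
  it_department_employees[it_department_employees$salary > 20000, ]
write.csv(it_department_high_salary, output_path, row.names = FALSE)
print(it_department_high_salary)
```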
Output:
10. Use the built-in dataset mtcars, a popular dataset describing the design and fuel
consumption of 32 different automobiles. The data was extracted from the 1974
Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile
design and performance for 32 automobiles (1973-74 models). Format: a data frame with 32
observations on 11 variables:
[1] mpg Miles/(US) gallon, [2] cyl Number of cylinders [3] disp Displacement (cu.in.),
[4] hp Gross horsepower [5] drat Rear axle ratio, [6] wt Weight (lb/1000)
[7] qsec 1/4 mile time, [8] vs V/S, [9] am Transmission (0 = automatic, 1 = manual),
[10] gear Number of forward gears, [11] carb Number of carburetors
a) What is the total number of observations and variables in the dataset?
b) Find the car with the largest hp and the least hp using suitable functions
c) Plot histogram / density for each variable and determine whether continuous variables are
normally distributed or not. If not, what is their skewness?
d) What is the average difference of gross horse power(hp) between automobiles with 3 and 4
number of cylinders(cyl)? Also determine the difference in their standard deviations.
e) Which pair of variables has the highest Pearson correlation?
install.packages("moments")
library(moments)
# Load the mtcars dataset
data(mtcars)
# a) What is the total number of observations and variables in the dataset?
num_observations <- nrow(mtcars)
num_variables <- ncol(mtcars)
# b) Find the car with the largest hp and the least hp using suitable functions
car_with_max_hp <- mtcars[which.max(mtcars$hp), ]
car_with_min_hp <- mtcars[which.min(mtcars$hp), ]
# c) Plot histogram / density for each variable and determine whether continuous variables are
# normally distributed or not. If not, what is their skewness?
par(mfrow = c(4, 3)) # Setting up a 4x3 grid for plots
# d) What is the average difference of gross horsepower (hp) between automobiles with 3 and
# 4 cylinders (cyl)? Also determine the difference in their standard deviations.
# Note: mtcars contains only 4-, 6- and 8-cylinder cars, so the cyl == 3 subset is empty
# and the two results below come out as NaN and NA.
average_hp_diff <- mean(mtcars[mtcars$cyl == 3, "hp"]) - mean(mtcars[mtcars$cyl == 4, "hp"])
std_deviation_diff <- sd(mtcars[mtcars$cyl == 3, "hp"]) - sd(mtcars[mtcars$cyl == 4, "hp"])
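Parts (c), (d) and (e) could be sketched as follows. Three assumptions are made here and should be adjusted as needed: skewness is computed with a small hypothetical helper (skew) instead of the moments package, so the snippet needs no extra installation; part (d) compares 4- and 6-cylinder cars, because mtcars has no 3-cylinder cars; and "highest" correlation in part (e) is taken to mean largest in absolute value.

```r
data(mtcars)

# Hypothetical helper: moment-based skewness, m3 / m2^(3/2)
skew <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# c) One histogram per variable on a 4x3 grid, then skewness of each variable
old_par <- par(mfrow = c(4, 3), mar = c(3, 3, 2, 1))
for (v in names(mtcars)) {
  hist(mtcars[[v]], main = v, xlab = v)
}
par(old_par)
skewness_values <- sapply(mtcars, skew)
print(round(skewness_values, 3))

# d) ASSUMPTION: compare 4- and 6-cylinder cars (there are no 3-cylinder cars)
hp_4 <- mtcars$hp[mtcars$cyl == 4]
hp_6 <- mtcars$hp[mtcars$cyl == 6]
average_hp_diff_4_6 <- mean(hp_6) - mean(hp_4)
std_deviation_diff_4_6 <- sd(hp_6) - sd(hp_4)

# e) Pair of variables with the largest absolute Pearson correlation
cor_matrix <- cor(mtcars)
cor_matrix[lower.tri(cor_matrix, diag = TRUE)] <- NA  # keep each pair once
best <- which(abs(cor_matrix) == max(abs(cor_matrix), na.rm = TRUE), arr.ind = TRUE)
best_pair <- c(rownames(cor_matrix)[best[1, "row"]],
               colnames(cor_matrix)[best[1, "col"]])
print(best_pair)
```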
Output:
11. Demonstrate the progression of salary with years of experience using a suitable data set
(You can create your own dataset). Plot the graph visualizing the best fit line on the plot of
the given data points. Plot a curve of Actual Values vs. Predicted values to show their
correlation and performance of the model. Interpret the meaning of the slope and y-intercept
of the line with respect to the given data. Implement using lm function. Save the graphs and
coefficients in files. Attach the predicted values of salaries as a new column to the original
data set and save the data as a new CSV file.
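A possible implementation is sketched below. The salary figures are made-up sample data, the column names Experience, Salary and Predicted_Salary are illustrative choices, and the files are written to a temporary directory as PDF and CSV; all of these are assumptions, since the exercise leaves the data set and file formats open.

```r
# ASSUMPTION: hypothetical sample data (years of experience vs. salary)
salary_data <- data.frame(
  Experience = 1:10,
  Salary = c(40000, 43500, 47000, 52000, 54500, 59000, 63500, 66000, 71000, 74500)
)

# Fit a simple linear regression: Salary = intercept + slope * Experience
model <- lm(Salary ~ Experience, data = salary_data)
coefs <- coef(model)
print(coefs)

# Attach the predicted salaries as a new column
salary_data$Predicted_Salary <- unname(predict(model, newdata = salary_data))

out_dir <- tempdir()  # ASSUMPTION: output location chosen for illustration

# Graph 1: data points with the best fit line
pdf(file.path(out_dir, "best_fit_line.pdf"))
plot(salary_data$Experience, salary_data$Salary, pch = 19,
     main = "Salary vs. Years of Experience",
     xlab = "Years of Experience", ylab = "Salary")
abline(model, col = "red", lwd = 2)
dev.off()

# Graph 2: actual vs. predicted values (points near the dashed line fit well)
pdf(file.path(out_dir, "actual_vs_predicted.pdf"))
plot(salary_data$Salary, salary_data$Predicted_Salary, pch = 19,
     main = "Actual vs. Predicted Salary",
     xlab = "Actual Salary", ylab = "Predicted Salary")
abline(0, 1, lty = 2)
dev.off()

# Save the coefficients and the extended data set
coef_path <- file.path(out_dir, "coefficients.csv")
data_path <- file.path(out_dir, "salary_data_with_predictions.csv")
write.csv(data.frame(term = names(coefs), estimate = unname(coefs)),
          coef_path, row.names = FALSE)
write.csv(salary_data, data_path, row.names = FALSE)
```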
Output:
Interpretation:
The slope (coefficient for Experience) represents the estimated change in salary for each
additional year of experience.
The y-intercept (coefficient for the intercept term) represents the estimated salary when
years of experience is zero. It is meaningful only if the data include values at or near zero
years of experience; otherwise it is an extrapolation beyond the observed data.
salary_data_with_predictions.csv