Data Analytics Using R Lab - Master Manual
DATA ANALYTICS USING R LAB MANUAL
CSE (AI & ML) Department Vision & Mission
Vision:
To produce admirable and competent graduates and experts in Artificial Intelligence and
Machine Learning through quality technical education, innovation, and research, to
improve the lifestyle of society.
Mission:
M1: To impart value-based technical education in AI & ML through innovative
teaching and learning methods.
M2: To produce outstanding professionals by imparting quality training, hands-on
experience, and value-based education.
M3: To produce competent graduates suitable for industries and organizations at the global
level, including research and development, with social responsibility.
LAB CODE
Students should report to the concerned lab as per the timetable.
Students who arrive late will not be permitted to carry out the program scheduled for the day.
After completing the program, students must have it certified in the observation book by the staff member in charge.
Students should bring a 100-page notebook and enter the readings and observations in it while performing the experiment.
The record of observations, along with the detailed experimental procedure of the experiment performed in the immediately preceding session, should be submitted and certified by the staff member in charge.
The group-wise division made at the beginning should be adhered to, and no mixing of students among different groups will be permitted.
When the experiment is completed, students should disconnect the setup they made and return all the components and instruments taken for the purpose.
Any damage to the equipment or burnt-out components will be viewed seriously, either by imposing a penalty or by dismissing the whole group of students from the lab for the semester/year.
Students should be present in the lab for the total scheduled duration.
Students are required to prepare thoroughly to perform the experiment before coming to the laboratory.
INDEX
1. Data Preprocessing
a. Handling missing values
b. Noise detection and removal
c. Identifying data redundancy and elimination
Program No. : 1
Date:
Problem Statement:
Data Preprocessing
a. Handling missing values
b. Noise detection and removal
c. Identifying data redundancy and elimination
Source Code:
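# Sample data with missing values (matches the original data printed in the output)
data <- data.frame(A = c(1, 2, NA, 4, 5), B = c(NA, 2, 3, NA, 5), C = c(1, 2, 3, 4, NA))
cat("Original Data:\n")
print(data)
# Replace each NA in a column with the median of that column's observed values
median_imputation <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  return(x)
}
# Apply the imputation column by column and rebuild the data frame
data_median_imputed <- as.data.frame(lapply(data, median_imputation))
cat("\nData after median imputation:\n")
print(data_median_imputed)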
Output:
Original Data:
A B C
1 1 NA 1
2 2 2 2
3 NA 3 3
4 4 NA 4
5 5 5 NA
Output:
Original Data:
[1] 1 2 3 100 5 6 7 200 9 10
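The source code for this part is not reproduced above, so the following is only a minimal sketch of one common approach, not the manual's own listing. It assumes the interquartile-range (IQR) rule, which treats any value more than 1.5 × IQR outside the quartiles as noise. The variable names are illustrative.

```r
# Sample data taken from the output above
data <- c(1, 2, 3, 100, 5, 6, 7, 200, 9, 10)

# Assumption: use the 1.5 * IQR rule to define the acceptable range
q1 <- quantile(data, 0.25, names = FALSE)
q3 <- quantile(data, 0.75, names = FALSE)
iqr <- q3 - q1
lower_bound <- q1 - 1.5 * iqr
upper_bound <- q3 + 1.5 * iqr

# Flag values outside the range, then drop them
is_noise <- data < lower_bound | data > upper_bound
noisy_values <- data[is_noise]
cleaned_data <- data[!is_noise]

cat("Noisy values:\n")
print(noisy_values)
cat("\nData after noise removal:\n")
print(cleaned_data)
```

With this vector the rule flags 100 and 200 and keeps the remaining eight values. Other noise-handling techniques, such as z-score thresholds or smoothing by binning, would fit the problem statement equally well.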
Output:
Original Data:
ID Name Age Gender
1 1 John 25 Male
2 2 Alice 30 Female
3 3 Bob 35 Male
4 4 John 25 Male
5 5 Alice 30 Female
Redundant Rows:
ID Name Age Gender
4 4 John 25 Male
5 5 Alice 30 Female
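The listing that produced this output is not reproduced above. One way to arrive at the same redundant rows is sketched below. It assumes that ID is only a row identifier, so redundancy is judged on the Name, Age, and Gender columns; the variable names are illustrative.

```r
# Sample data reconstructed from the output above
data <- data.frame(
  ID = 1:5,
  Name = c("John", "Alice", "Bob", "John", "Alice"),
  Age = c(25, 30, 35, 25, 30),
  Gender = c("Male", "Female", "Male", "Male", "Female")
)

# Assumption: ID is a surrogate key, so it is excluded from the comparison
key_columns <- c("Name", "Age", "Gender")

# duplicated() marks every repeat of an earlier row as TRUE
is_redundant <- duplicated(data[, key_columns])

redundant_rows <- data[is_redundant, ]
deduplicated_data <- data[!is_redundant, ]

cat("Redundant Rows:\n")
print(redundant_rows)
cat("\nData after eliminating redundancy:\n")
print(deduplicated_data)
```

Rows 4 and 5 repeat rows 1 and 2 on every column except ID, so they are the ones reported and removed.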
Program No. : 2
Date:
Problem Statement: Implement any one imputation model
Source Code:
# Sample data with missing values
data <- data.frame(
  A = c(1, 2, NA, 4, 5),
  B = c(NA, 2, 3, NA, 5),
  C = c(1, 2, 3, 4, NA)
)
Output:
Original Data:
A B C
1 1 NA 1
2 2 2 2
3 NA 3 3
4 4 NA 4
5 5 5 NA
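The listing above stops after defining the sample data, so the imputation step itself is not shown. As a minimal sketch of one possible imputation model, the code below assumes mean imputation, where each missing value is replaced by the mean of the observed values in the same column. The function and variable names are illustrative.

```r
# Sample data with missing values (same as in the listing above)
data <- data.frame(
  A = c(1, 2, NA, 4, 5),
  B = c(NA, 2, 3, NA, 5),
  C = c(1, 2, 3, 4, NA)
)

# Assumption: mean imputation is the chosen model
mean_imputation <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Apply the imputation to every column and rebuild the data frame
data_mean_imputed <- as.data.frame(lapply(data, mean_imputation))

cat("Data after mean imputation:\n")
print(data_mean_imputed)
```

Mean imputation is simple but shrinks the variance of each column. Model-based alternatives, such as regression or k-nearest-neighbour imputation, are also valid answers to the problem statement.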
Program No. : 3
Date:
Source Code:
# Sample data
x <- c(1, 2, 3, 4, 5)
y <- c(2, 3, 4, 5, 6)
# Add legend
legend("topright", legend = "Regression Line", col = "red", lty = 1, cex = 0.8)
Output:
Regression Coefficients:
(Intercept) x
1 1
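Only the sample data and the legend call survive in the listing above. The sketch below shows one way the missing steps could look, assuming a simple linear regression fitted with lm() and drawn with base graphics; the plot title and point style are illustrative choices.

```r
# Sample data
x <- c(1, 2, 3, 4, 5)
y <- c(2, 3, 4, 5, 6)

# Fit a simple linear regression of y on x
model <- lm(y ~ x)

cat("Regression Coefficients:\n")
print(coef(model))

# Plot the points and overlay the fitted line
plot(x, y, main = "Linear Regression", pch = 16)
abline(model, col = "red")

# Add legend
legend("topright", legend = "Regression Line", col = "red", lty = 1, cex = 0.8)
```

Because y equals x + 1 exactly in this sample, the fitted intercept and slope are both 1, which agrees with the coefficients printed in the output.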
Program No. : 4
Date:
Source Code:
# Sample data
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
# Add legend
legend("topright", legend = "Logistic Regression Curve", col = "red", lty = 1, cex = 0.8)
Output:
Regression Coefficients:
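The model-fitting and plotting steps are missing from the listing above. A minimal sketch is given below, assuming logistic regression fitted with glm() and a binomial family. Note that in this sample every x up to 4 has y = 0 and every x from 5 has y = 1, so the classes are perfectly separable. In that situation glm() emits warnings and the coefficient estimates grow very large, so they should not be read as meaningful values.

```r
# Sample data
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1)

# Fit a logistic regression (expect warnings: the classes are perfectly separable)
model <- glm(y ~ x, family = binomial)

cat("Regression Coefficients:\n")
print(coef(model))

# Predicted probabilities on a fine grid, used to draw a smooth curve
grid <- data.frame(x = seq(min(x), max(x), length.out = 100))
grid$prob <- predict(model, newdata = grid, type = "response")

# Plot the observations and the fitted probability curve
plot(x, y, main = "Logistic Regression", pch = 16)
lines(grid$x, grid$prob, col = "red")

# Add legend
legend("topright", legend = "Logistic Regression Curve", col = "red", lty = 1, cex = 0.8)

# Classify each observation with a 0.5 probability threshold
predicted_class <- as.numeric(fitted(model) > 0.5)
print(predicted_class)
```

Adding a few overlapping observations, for example a 1 at a low x or a 0 at a high x, would remove the separation and give stable coefficient estimates.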
Program No. : 5
Date:
Source Code:
# Sample data
data <- data.frame(
  Feature1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  Feature2 = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
  Class = c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B")
)
Output:
Decision Rules:
n= 10
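The model-building step is not shown in the listing above. The "n= 10" header in the output resembles what the rpart package prints, so the sketch below assumes rpart was used; this is an inference, not something the listing states. With only 10 rows, rpart's default minsplit of 20 would leave the tree as a single root node, so the sketch lowers it.

```r
library(rpart)

# Sample data; Class is stored as a factor for classification
data <- data.frame(
  Feature1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  Feature2 = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
  Class = factor(c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B"))
)

# Assumption: lower minsplit so that a 10-row data set can be split at all
tree_model <- rpart(
  Class ~ Feature1 + Feature2,
  data = data,
  method = "class",
  control = rpart.control(minsplit = 2)
)

cat("Decision Rules:\n")
print(tree_model)

# Predict the class of each training row
predictions <- predict(tree_model, newdata = data, type = "class")
print(predictions)
```

In this sample Feature2 determines Class exactly (0 gives A, 1 gives B), so a single split on Feature2 is enough to classify every row correctly.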
Program No. : 6
Date:
Source Code:
# Sample data
data <- iris
# Output predictions
cat("Predictions:\n")
print(predictions)
Output:
Predictions:
[1] setosa setosa setosa setosa setosa setosa setosa
[8] setosa setosa setosa setosa setosa setosa setosa
[15] setosa setosa setosa setosa setosa setosa setosa
[22] setosa setosa setosa setosa setosa setosa setosa
[29] setosa setosa setosa setosa setosa setosa setosa
[36] setosa setosa setosa setosa setosa setosa setosa
[43] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[50] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[57] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[64] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[71] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[78] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[85] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[92] virginica versicolor versicolor versicolor versicolor versicolor versicolor
[99] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[106] virginica virginica virginica virginica virginica virginica virginica
[113] virginica virginica virginica virginica virginica virginica virginica
[120] virginica virginica virginica virginica virginica virginica virginica
[127] virginica virginica virginica virginica virginica virginica virginica
[134] virginica virginica virginica virginica virginica virginica virginica
[141] virginica virginica virginica virginica virginica virginica virginica
[148] virginica virginica virginica virginica
Levels: setosa versicolor virginica
Program No. : 7
Date:
Source Code:
# Install and load the forecast package if not already installed
if (!requireNamespace("forecast", quietly = TRUE)) {
  install.packages("forecast")
}
library(forecast)
Output:
Program No. : 8
Date:
Source Code:
# Sample data
set.seed(123)
data <- matrix(rnorm(100), ncol = 2)
# Determine clusters
k <- 3
clusters <- cutree(hc, k)
Output:
Cluster Assignments:
[1] 2 2 1 1 1 1 1 1 3 3 2 3 1 1 3 1 3 3 1 1 1 1 3 1 3 2 2 3 3 1 2 2 2 3 2 2 2
[38] 1 3 2 1 3 2 2 1 3 1 3 2 2 2 2 2 1 3 3 2 1 3 1 1 2 2 2 2 2 1 1 1 2 3 1 1 1
[75] 1 1 1 1 2 3 3 3 2 1 1 3 2 2 3 1 1 2 2 3 1 1 2 2 2
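The listing above uses hc without showing how it was created. A minimal sketch of the missing step follows, assuming Euclidean distances and complete linkage, which are the defaults of dist() and hclust(). The linkage method actually used is not stated in the manual.

```r
# Sample data: 50 points in two dimensions
set.seed(123)
data <- matrix(rnorm(100), ncol = 2)

# Assumption: Euclidean distance and complete linkage (the defaults)
distance_matrix <- dist(data)
hc <- hclust(distance_matrix, method = "complete")

# Determine clusters by cutting the dendrogram into k groups
k <- 3
clusters <- cutree(hc, k)

cat("Cluster Assignments:\n")
print(clusters)

# Draw the dendrogram to inspect the merge structure
plot(hc, main = "Hierarchical Clustering Dendrogram")
```

The sample matrix has 50 rows, so this sketch prints 50 assignments. The labels depend on the linkage method and on the data, so they need not match the printed output above.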
Program No. : 9
Date:
Problem Statement: Perform visualization techniques (types of charts: Bar, Column, Line,
Scatter, 3D Cubes, etc.)
Source Code:
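# Python libraries used by this listing
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Path of the file to read
flight_filepath = "../input/flight_delays.csv"
# Read the file into a DataFrame (assumes the CSV has a "Month" column to use as the index)
flight_data = pd.read_csv(flight_filepath, index_col="Month")
# Add title
plt.title("Average Arrival Delay for Spirit Airlines Flights, by Month")
# Bar chart showing average arrival delay for Spirit Airlines flights by month
sns.barplot(x=flight_data.index, y=flight_data['NK'])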
Output:
Line Graph:
Source Code:
Output:
Scatter Graph:
Source Code:
sns.scatterplot(x=insurance_data['bmi'], y=insurance_data['charges'])
Output:
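The listings in this program use Python's seaborn library, and the flight and insurance data files are not included in the manual. Since the rest of the manual uses R, the same three chart types can be sketched with base R graphics. The example below assumes R's built-in mtcars and AirPassengers data sets as stand-ins; they are not the data used above.

```r
# Bar chart: number of cars for each cylinder count in mtcars
cyl_counts <- table(mtcars$cyl)
barplot(cyl_counts,
        main = "Cars by Number of Cylinders",
        xlab = "Cylinders", ylab = "Count", col = "steelblue")

# Line graph: monthly airline passenger totals over time
plot(AirPassengers, type = "l",
     main = "Monthly Airline Passengers",
     xlab = "Year", ylab = "Passengers (thousands)")

# Scatter plot: fuel efficiency against vehicle weight
plot(mtcars$wt, mtcars$mpg, pch = 16,
     main = "Fuel Efficiency vs. Weight",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
```

When run as a script, each call opens a new plot on the active graphics device. In an interactive session the plots appear one after another in the plot window.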
Program No. : 10
Date:
Source Code:
# Summary statistics
summary_stats <- summary(healthcare_data)
print(summary_stats)
Output:
Program No. : 11
Date:
Source Code:
# Summary statistics
summary_stats <- summary(sales_data)
print(summary_stats)
geom_line(data = test_data, aes(x = date, y = predicted_sales), color = "red", linetype = "dashed") +
labs(title = "Actual vs. Predicted Sales", x = "Date", y = "Sales")
print(actual_vs_predicted_plot)
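The fragments above refer to test_data, predicted_sales, and an actual-versus-predicted plot, but the data loading, train/test split, and model are not shown. The sketch below illustrates that workflow under stated assumptions: the built-in BJsales series stands in for the sales data, a linear trend fitted with lm() stands in for the model, and base graphics replace the ggplot2 calls. None of these choices come from the manual.

```r
# Assumption: use the built-in BJsales series as stand-in sales data
sales_data <- data.frame(
  period = seq_along(BJsales),
  sales = as.numeric(BJsales)
)

# Hold out the last 30 periods for testing
n_test <- 30
n_total <- nrow(sales_data)
train_data <- sales_data[1:(n_total - n_test), ]
test_data <- sales_data[(n_total - n_test + 1):n_total, ]

# Assumption: a simple linear trend model fitted on the training periods
model <- lm(sales ~ period, data = train_data)
test_data$predicted_sales <- predict(model, newdata = test_data)

# Root mean squared error on the held-out periods
rmse <- sqrt(mean((test_data$sales - test_data$predicted_sales)^2))
cat("Test RMSE:", rmse, "\n")

# Actual series in black, predictions for the test periods in dashed red
plot(sales_data$period, sales_data$sales, type = "l",
     main = "Actual vs. Predicted Sales", xlab = "Period", ylab = "Sales")
lines(test_data$period, test_data$predicted_sales, col = "red", lty = 2)
legend("topleft", legend = c("Actual", "Predicted"),
       col = c("black", "red"), lty = c(1, 2), cex = 0.8)
```

Splitting by time order rather than at random keeps future observations out of the training set, which matters for any forecasting task.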
Output:
Program No. : 12
Problem Statement: Apply Predictive analytics for Weather forecasting
Source Code:
# Summary statistics
summary_stats <- summary(weather_data)
print(summary_stats)
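The listing above assumes a weather_data object that is not defined in the manual, and the forecasting step is not shown. As a minimal sketch, the code below assumes the built-in nottem series (monthly average air temperatures at Nottingham, 1920-1939, in degrees Fahrenheit) as stand-in data and additive Holt-Winters exponential smoothing as the predictive model. Both are illustrative choices, not the manual's.

```r
# Assumption: use the built-in nottem series as stand-in weather data
weather_data <- nottem

# Summary statistics
summary_stats <- summary(weather_data)
print(summary_stats)

# Hold out the final year (1939) to check the forecast
train <- window(weather_data, end = c(1938, 12))
test <- window(weather_data, start = c(1939, 1))

# Assumption: additive Holt-Winters smoothing with trend and seasonality
model <- HoltWinters(train)
forecast_values <- predict(model, n.ahead = 12)

# Mean absolute error of the 12-month forecast
mae <- mean(abs(as.numeric(test) - as.numeric(forecast_values)))
cat("Mean absolute error (deg F):", mae, "\n")

# Actual temperatures in black, forecast in dashed red
plot(test, main = "Actual vs. Forecast Temperature (1939)",
     ylab = "Temperature (deg F)")
lines(forecast_values[, "fit"], col = "red", lty = 2)
```

HoltWinters() and its predict() method ship with base R's stats package, so this sketch needs no extra installation.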
Output: