
R Practicals (2007 Version)

ST. MARY’S COLLEGE (AUTONOMOUS)


THOOTHUKUDI

M.Sc. Computer Science (SSC)


(DATA MINING USING R)
2023-2024

Reg.No :

Name :

Semester :

Year :

ST. MARY’S COLLEGE (AUTONOMOUS)


THOOTHUKUDI

DEPARTMENT OF COMPUTER SCIENCE (SSC)


BONAFIDE CERTIFICATE

Reg.No :

Name :

Sub.Code :

This is to certify that this is the bonafide record of the practical work done in
DATA MINING USING R at St. Mary’s College (Autonomous), during the year 2023-2024,
submitted for the M.Sc. Computer Science Practical Examination held on
at St. Mary’s College (Autonomous), Thoothukudi.

Date: Staff-in-charge

Head of the department External Examiner


INDEX

EX. NO.   DATE         CONTENT                    PG. NO.   SIGNATURE

1         27.11.2023   APRIORI ALGORITHM
2         06.12.2023   K-MEANS CLUSTERING
3         11.01.2024   HIERARCHICAL CLUSTERING
4         07.02.2024   CLASSIFICATION
5         13.02.2024   DECISION TREE
6         22.02.2024   LINEAR REGRESSION
7         06.03.2024   DATA VISUALIZATION

1. Apriori Algorithm
library(arules)
library(arulesViz)
library(RColorBrewer)

# Import the dataset
data("Groceries")

# Mine association rules with apriori()
rules <- apriori(Groceries, parameter = list(supp = 0.01, conf = 0.2))

# Show the first 10 rules with inspect()
inspect(rules[1:10])

# Plot the 20 most frequent items with itemFrequencyPlot()
arules::itemFrequencyPlot(Groceries, topN = 20,
                          col = brewer.pal(8, 'Pastel2'),
                          main = 'Relative Item Frequency Plot',
                          type = "relative",
                          ylab = "Item Frequency (Relative)")


OUTPUT:

2. K-Means Clustering
set.seed(123)

# Generate two groups of 50 points, centred near (0, 0) and (5, 5)
data <- data.frame(x = c(rnorm(50, mean = 0), rnorm(50, mean = 5)),
                   y = c(rnorm(50, mean = 0), rnorm(50, mean = 5)))

# Plot the raw data
plot(data$x, data$y, col = "blue", pch = 16, main = "Sample Data for K-Means Clustering")

# Run k-means with 2 clusters and 20 random starts
kmeans_result <- kmeans(data, centers = 2, nstart = 20)
data$cluster <- kmeans_result$cluster

# Plot the points coloured by cluster and mark the cluster centres
plot(data$x, data$y, col = data$cluster, pch = 16, main = "K-Means Clustering (K=2)")
points(kmeans_result$centers, col = 1:2, pch = 8, cex = 2)

cat("Cluster Centers:\n")
print(kmeans_result$centers)
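
The listing fixes K = 2 in advance. One common way to check that choice is to compare the total within-cluster sum of squares for several values of K. The following is a minimal sketch under stated assumptions: it regenerates the same sample data, and the names `pts`, `k_values` and `wss` are introduced here for illustration only and are not part of the practical.

```r
# Sketch only: 'pts', 'k_values' and 'wss' are hypothetical names.
set.seed(123)
pts <- data.frame(x = c(rnorm(50, mean = 0), rnorm(50, mean = 5)),
                  y = c(rnorm(50, mean = 0), rnorm(50, mean = 5)))

# Total within-cluster sum of squares for K = 1 to 5
k_values <- 1:5
wss <- sapply(k_values, function(k) {
  kmeans(pts, centers = k, nstart = 20)$tot.withinss
})

# A sharp bend ("elbow") in this curve suggests a reasonable K
plot(k_values, wss, type = "b", pch = 16,
     xlab = "Number of clusters (K)",
     ylab = "Total within-cluster sum of squares",
     main = "Elbow Plot for K-Means")
print(wss)
```

For data built from two well-separated groups, the curve would be expected to drop steeply from K = 1 to K = 2 and flatten afterwards.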

OUTPUT:
3. Hierarchical Clustering
library(dplyr)

# Preview the built-in mtcars dataset
head(mtcars)

# Compute the Euclidean distance matrix
distance_mat <- dist(mtcars, method = 'euclidean')
distance_mat

# Fit the hierarchical clustering model (average linkage)
set.seed(240) # Setting seed
Hierar_cl <- hclust(distance_mat, method = "average")
Hierar_cl

# Plot the dendrogram
plot(Hierar_cl)

# Choosing the number of clusters
# Mark a cut height of 110 on the dendrogram
abline(h = 110, col = "green")

# Cut the tree into 3 clusters
fit <- cutree(Hierar_cl, k = 3)
fit
table(fit)
rect.hclust(Hierar_cl, k = 3, border = "green")
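
The listing draws a line at height 110 on the dendrogram but assigns clusters only by count (k = 3). As a minimal sketch, `cutree()` also accepts a height through its `h` argument, so the two cuts can be compared directly. The names `hc`, `groups_by_height` and `groups_by_k` are assumptions made for this illustration.

```r
# Sketch only: 'hc', 'groups_by_height' and 'groups_by_k' are hypothetical names.
distance_mat <- dist(mtcars, method = "euclidean")
hc <- hclust(distance_mat, method = "average")

# Cut the tree at height 110 instead of asking for a fixed number of clusters
groups_by_height <- cutree(hc, h = 110)

# Cut the tree into exactly 3 clusters, as in the listing
groups_by_k <- cutree(hc, k = 3)

# Cluster sizes for the height-based cut
print(table(groups_by_height))

# Cross-tabulate the two assignments to see where they agree
print(table(by_height = groups_by_height, by_k = groups_by_k))
```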

OUTPUT:
4. Classification Algorithm

library(party)

# Create the input data frame from the first 105 rows.
input.data <- readingSkills[c(1:105), ]

# Create the tree.
output.tree <- ctree(nativeSpeaker ~ age + shoeSize + score, data = input.data)

# Plot the tree.
plot(output.tree)

OUTPUT:

5. Decision Tree

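library(datasets)
library(caTools)
library(party)
library(dplyr)
library(magrittr)

# Load and preview the dataset
data("readingSkills")
head(readingSkills)

# Split rows 80/20 into training and test sets.
# sample.split() expects the label vector, not the whole data frame.
sample_data <- sample.split(readingSkills$nativeSpeaker, SplitRatio = 0.8)
train_data <- subset(readingSkills, sample_data == TRUE)
test_data <- subset(readingSkills, sample_data == FALSE)

# Fit the decision tree on the training set and plot it
model <- ctree(nativeSpeaker ~ ., train_data)
plot(model)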

OUTPUT:

6. Linear Regression
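# x: heights in cm, y: weights in kg
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.
relation <- lm(y ~ x)
print(summary(relation))

# Plot the chart (weight on the x-axis, height on the y-axis).
plot(y, x, col = "blue", main = "Height & Weight Regression",
     cex = 1.3, pch = 16, xlab = "Weight in Kg", ylab = "Height in cm")

# Add the fitted line; height is regressed on weight to match the plot axes.
abline(lm(x ~ y))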
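
Once the model is fitted, it can be used to estimate a weight for a height that is not in the sample. The following is a minimal sketch using `predict()`; the new height of 170 cm and the names `new_height` and `predicted_weight` are assumptions chosen for illustration.

```r
# Same data as the listing: x = heights in cm, y = weights in kg
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# Hypothetical new observation: a person 170 cm tall.
# The column name must match the predictor name used in lm().
new_height <- data.frame(x = 170)
predicted_weight <- predict(relation, new_height)

# Show the fitted intercept and slope, then the predicted weight in kg
print(coef(relation))
print(predicted_weight)
```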

OUTPUT:

7. Data Visualization

# Load the built-in airquality dataset
data(airquality)

# Bar plots of the ozone readings (horizontal, then vertical)
barplot(airquality$Ozone, main = 'Ozone Concentration in Air',
        xlab = 'Ozone levels', horiz = TRUE)
barplot(airquality$Ozone, main = 'Ozone Concentration in Air',
        xlab = 'Ozone levels', col = 'blue', horiz = FALSE)

# Histogram of daily maximum temperature
hist(airquality$Temp, main = "La Guardia Airport's Maximum Temperature (Daily)",
     xlab = "Temperature (Fahrenheit)",
     xlim = c(50, 125), col = "yellow",
     freq = TRUE)

# Box plots for the first four columns, then for wind speed alone
boxplot(airquality[, 1:4], main = 'Box Plots for Air Quality Parameters')
boxplot(airquality$Wind, main = "Average Wind Speed at La Guardia Airport",
        xlab = "Miles per hour", ylab = "Wind",
        col = "orange", border = "brown",
        horizontal = TRUE, notch = TRUE)

# Scatter plot of ozone concentration against month
plot(airquality$Ozone, airquality$Month, main = "Scatterplot Example",
     xlab = "Ozone Concentration in parts per billion",
     ylab = "Month of observation",
     pch = 19)

OUTPUT:
