CSE 3121 Information Visualization R Studio All Codes

Uploaded by

Dhaarani Pushpam

CSE 3121 Information Visualization

Name: V. Yuthish Kumar
Reg. No.: 21MIA1023

Lab Assignment 2 - Association Rule Mining in R

The Apriori algorithm is used to perform association rule mining on the "Adult" dataset.
Support measures how frequently a particular itemset occurs in a dataset. Higher support
values indicate that the itemset occurs more frequently.

Confidence measures the reliability of an association rule A => B. A high confidence value
indicates that the consequent item (B) is highly likely to occur given that the antecedent
item (A) occurs.
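
The two measures can be worked out by hand on a small example. The sketch below uses a made-up list of five transactions (hypothetical data, not taken from the Adult dataset) and two illustrative helper functions, support() and confidence(), which are not part of arules.

```r
# Toy transactions (hypothetical data, not from the Adult dataset)
toy_txns <- list(
  c("A", "B", "C"),
  c("A", "B"),
  c("A", "C"),
  c("B", "C"),
  c("A", "B", "D")
)

# Fraction of transactions that contain every item in `items`
support <- function(items, txns) {
  mean(vapply(txns, function(txn) all(items %in% txn), logical(1)))
}

# Confidence of the rule lhs => rhs: support(lhs and rhs) / support(lhs)
confidence <- function(lhs, rhs, txns) {
  support(c(lhs, rhs), txns) / support(lhs, txns)
}

support_ab <- support(c("A", "B"), toy_txns)
confidence_ab <- confidence("A", "B", toy_txns)
print(support_ab)
print(confidence_ab)
```

Here A and B occur together in 3 of the 5 transactions, so the support is 0.6. A occurs in 4 transactions, so the confidence of A => B is 3/4 = 0.75.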

Code:

library(arules)

# Load the Adult dataset
data("Adult")

# Explore the Adult dataset
summary(Adult)
inspect(head(Adult)) # show the first few rows of the dataset

# Perform association rule mining using the Apriori algorithm
# (confidence is left at its default of 0.8)
rules <- apriori(Adult, parameter = list(support = 0.01, minlen = 2))

# Explore the generated rules
summary(rules)
inspect(head(rules))

Output:
Inference:
A support of about 0.01 indicates that (education=Doctorate) and (race=White) appear
together in about 1 percent of the rows of the dataset. With 48842 transactions in total, the
combination appears 493 times (493/48842 ≈ 0.0101). Other combinations can be interpreted
in the same way.

Confidence indicates the reliability of an association rule; a high confidence value such as
0.96 indicates that when (education=5th-6th) appears, (capital-loss=None) also appears in
about 96 percent of those rows.

Lab Assignment 3 - Clustering in R


We perform k-means clustering in R on the USArrests dataset.
Code:
# Load the plotting library
library(ggplot2)

# Read the data
df <- USArrests
View(df)

# Perform k-means clustering on the Murder and Assault columns
km <- kmeans(df[, c("Murder", "Assault")], centers = 3, nstart = 25)
km

# Add the cluster assignment to the original dataset
df$cluster <- as.factor(km$cluster)

# Visualize the clusters
ggplot(df, aes(x = Murder, y = Assault, color = cluster)) +
  geom_point() +
  labs(title = "Visualization of Clusters on USArrests Dataset",
       x = "Murder Rate",
       y = "Assault Rate")

Output:
The output shows some of the rows of the USArrests dataset.

We perform clustering using only the Murder and Assault columns, with the number of
clusters set to 3.
Each row of USArrests is a US state, so the states are grouped into clusters based on their
murder and assault arrest rates (arrests per 100,000 residents).
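
To see what the three clusters contain, the cluster sizes and per-cluster averages can be tabulated. This is a minimal sketch using the same kmeans() settings as the lab code; the seed value and the names arrests, cluster_sizes and cluster_means are arbitrary choices made for illustration.

```r
# Assumption: same settings as the lab code; the seed value is arbitrary
set.seed(42)
arrests <- USArrests
km <- kmeans(arrests[, c("Murder", "Assault")], centers = 3, nstart = 25)

# Number of states assigned to each cluster
cluster_sizes <- table(km$cluster)
print(cluster_sizes)

# Mean murder and assault rate within each cluster
cluster_means <- aggregate(cbind(Murder, Assault) ~ cluster,
                           data = cbind(arrests, cluster = km$cluster),
                           FUN = mean)
print(cluster_means)
```

The per-cluster means match the cluster centers reported in km$centers. Because Assault has much larger values than Murder, it dominates the distance calculation unless the columns are scaled first.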

Lab Assignment 4 - Visualisation and Statistics in R

Here we apply several visualization techniques and perform some statistical calculations.

Code:

# Data definition
data <- c(19, 23, 11, 5, 16, 21, 32, 14, 19, 27, 39, 32, 20, 21)

# Calculate the median
median_value <- median(data)
median_value

# Calculate the mean
mean_value <- mean(data)
mean_value

# Calculate the standard deviation
deviation <- sd(data)
deviation
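
The three statistics can be cross-checked against their definitions. The sketch below recomputes them by hand on the same values; the variable names are illustrative. Note that sd() in R computes the sample standard deviation, which divides by n - 1.

```r
ages <- c(19, 23, 11, 5, 16, 21, 32, 14, 19, 27, 39, 32, 20, 21)
n <- length(ages)

# Mean: sum of the values divided by their count
manual_mean <- sum(ages) / n

# Sample standard deviation: divide the squared deviations by n - 1
manual_sd <- sqrt(sum((ages - manual_mean)^2) / (n - 1))

# Median for an even count: average of the two middle sorted values
sorted_ages <- sort(ages)
manual_median <- (sorted_ages[n / 2] + sorted_ages[n / 2 + 1]) / 2

print(c(mean = manual_mean, sd = manual_sd, median = manual_median))
```

The 14 values sum to 299, so the mean is about 21.36, and the two middle values are 20 and 21, so the median is 20.5.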

Histogram:
hist(data, xlab = "Ages of Students", col = "brown", border = "black")

Bar Plot:
# bar plot
H <- c(23, 32, 28, 39, 41)
M <- c("John", "Julie", "Jackie", "Sam", "Robert")

barplot(H, names.arg = M, xlab = "Name", ylab = "Age", col = "blue",
        main = "Ages")

Pie Chart:
# pie chart
x <- c(21, 30, 34, 15)
labels <- c("Chennai", "Hyderabad", "Kerala", "Mumbai")

# Plot the chart
pie(x, labels, main = "Percentage of Unemployment per City")

Lab Assignment 5 - Market Basket Analysis

One of the main applications of association rule mining is Market Basket Analysis. Here the
Apriori algorithm is used to perform Market Basket Analysis in R on the Groceries dataset.

Market Basket Analysis (MBA) is a data mining technique used to discover associations
between different items purchased together in transactions. It is widely applied in the retail
and e-commerce industries to understand customer purchasing behavior and to optimize
product placement.
Code:
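library(arules)

# Load the Groceries dataset
data("Groceries")
transactions <- Groceries

# Explore the dataset
summary(transactions)
inspect(head(transactions))

# Mine rules with minimum support 0.001 and minimum confidence 0.5
rules <- apriori(transactions, parameter = list(support = 0.001, confidence = 0.5))

# Show the first few rules
inspect(head(rules))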
Output:

Inference:
A support of about 0.003 indicates that cereals and whole milk appear together in well
under 1 percent of the rows of the dataset. With 9835 transactions in total, the combination
appears 36 times, so the exact support is 36/9835 ≈ 0.0037, or about 0.37 percent.

Confidence indicates the reliability of an association rule; a confidence value of 0.64
indicates that about 64 percent of the transactions containing cereals also contain whole
milk, so when a person buys cereals there is a good chance they will also buy whole milk.
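
The figures quoted in this inference can be tied together with a little arithmetic. The sketch below uses only the numbers stated above; the estimated count of transactions containing cereals is derived from them and is not a value reported in the lab output.

```r
n_transactions <- 9835   # total transactions, from the inference above
rule_count <- 36         # transactions with both cereals and whole milk
rule_confidence <- 0.64  # confidence quoted above (rounded)

# Support = count of the combination / total transactions
rule_support <- rule_count / n_transactions

# Confidence = count(cereals and whole milk) / count(cereals),
# so count(cereals) is roughly rule_count / rule_confidence
cereals_count <- rule_count / rule_confidence

print(round(rule_support, 4))
print(round(cereals_count))
```

So roughly 56 transactions contain cereals, and 36 of them also contain whole milk.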
Lab Assignment 6 - Decision Tree in R
We use the iris dataset to build and plot a decision tree with the rpart library.

Code:
# Load the dataset
data(iris)
View(iris)

# Explore the structure of the dataset
str(iris)

# Train the decision tree model to predict Species from the four measurements
library(rpart)
model <- rpart(Species ~ ., data = iris, method = "class")

# Plot the decision tree
library(rpart.plot)
rpart.plot(model)
Output:

The decision tree identifies the species of a flower from its petal length and petal width.
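
How well the tree separates the species can be checked with a confusion matrix. This is a minimal sketch that assumes Species is the target and all four measurements are predictors; the names fit, predicted, confusion and accuracy are illustrative. It is evaluated on the training rows, so the accuracy is optimistic.

```r
library(rpart)

# Assumption: Species is the target, all four measurements are predictors
fit <- rpart(Species ~ ., data = iris, method = "class")

# Predict the species for the training rows and tabulate against the truth
predicted <- predict(fit, iris, type = "class")
confusion <- table(actual = iris$Species, predicted = predicted)
print(confusion)

# Share of rows classified correctly (training accuracy, so optimistic)
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```

A held-out test set would give a fairer estimate than predicting on the rows used for training.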
