CSE 3121 Information Visualization R Studio All Codes
Name: V. Yuthish Kumar
Regno: 21MIA1023
The Apriori algorithm is used to perform association rule mining on the "Adult" dataset.
Support measures how frequently a particular itemset occurs in a dataset. Higher support
values indicate that the itemset occurs in a larger share of the transactions.
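To make the definition of support concrete, here is a minimal base R sketch. The transactions and the helper name `support` are hypothetical and chosen only for illustration; they are not taken from the Adult dataset or from the arules package.

```r
# Toy transactions (hypothetical data, not taken from the Adult dataset)
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("milk", "butter", "bread"),
  c("milk"),
  c("bread", "milk", "eggs")
)

# Support of an itemset = fraction of transactions containing every item in it
support <- function(itemset, transactions) {
  contains <- vapply(transactions,
                     function(t) all(itemset %in% t),
                     logical(1))
  mean(contains)
}

# Bread and milk occur together in 3 of the 5 transactions
support_bread_milk <- support(c("bread", "milk"), transactions)
support_bread_milk
```

An itemset that appears in more transactions gets a proportionally higher value, which is what the support column of a rule listing reports.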
Code:
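library(arules)
# Load the Adult dataset
data("Adult")
# Mine rules; these thresholds are assumed, the original values are not shown
rules <- apriori(Adult, parameter = list(supp = 0.01, conf = 0.8))
inspect(head(rules))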
Output:
Inference:
A support value of about 0.01 indicates that (education=Doctorate) and (race=White) appear
together in about 1 percent of the rows of the dataset. There are 48842 transactions in total,
so this combination appears 493 times. Other combinations can be interpreted in the same way.
Confidence indicates the reliability of an association rule; a high confidence value such as
0.96 indicates that when (education=5th-6th) appears, (capital-loss=None) also appears in
96 percent of those cases.
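Confidence can be computed by hand as the number of transactions containing both sides of a rule divided by the number containing the left-hand side. The following is a small sketch in base R; the baskets and the function names `count_containing` and `confidence` are hypothetical and are not part of arules.

```r
# Toy transactions (hypothetical data for illustration only)
baskets <- list(
  c("A", "B"),
  c("A", "B", "C"),
  c("A", "C"),
  c("B", "C"),
  c("A", "B")
)

# Number of transactions that contain every item of an itemset
count_containing <- function(itemset, baskets) {
  sum(vapply(baskets, function(b) all(itemset %in% b), logical(1)))
}

# confidence(X => Y) = count(X and Y together) / count(X)
confidence <- function(lhs, rhs, baskets) {
  count_containing(c(lhs, rhs), baskets) / count_containing(lhs, baskets)
}

# A occurs in 4 baskets, and B accompanies it in 3 of them
conf_a_b <- confidence("A", "B", baskets)
conf_a_b
```

A value close to 1 means the right-hand side almost always accompanies the left-hand side, which is how the 0.96 figure should be read.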
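# Visualizing clusters
# Assumes df holds the Murder and Assault columns plus a factor column named cluster
library(ggplot2)
ggplot(df, aes(x = Murder, y = Assault, color = cluster)) +
  geom_point() +
  labs(title = "Visualization of Clusters on USArrests Dataset",
       x = "Murder Rate",
       y = "Assault Rate")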
Output:
These are some of the rows of the USArrests dataset.
We perform clustering using only the Murder and Assault columns, with the number of clusters
set to 3.
Here we group the US states into clusters based on each state's murder and assault arrest rates.
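The clustering step itself is not listed here. The following is a minimal sketch of how the `df` data frame with its `cluster` column might be produced, assuming k-means was the method used; the seed and the `nstart` value are illustrative choices, not values from the original work.

```r
# Keep only the two columns used for clustering
df <- USArrests[, c("Murder", "Assault")]

# k-means is an assumed choice of algorithm; the seed makes the random start reproducible
set.seed(42)
km <- kmeans(df, centers = 3, nstart = 25)

# Store the assignment as a factor so it can be mapped to a discrete colour
df$cluster <- factor(km$cluster)

# Number of states in each cluster
table(df$cluster)
```

Because the columns are not scaled in this sketch, Assault, which has the larger numeric range, dominates the distance calculation; applying `scale()` to the two columns first would weight them equally.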
Code:
# Data definition
data <- c(19, 23, 11, 5, 16, 21, 32, 14, 19, 27, 39, 32, 20, 21)
# Calculate median
median_value <- median(data)
median_value
# Calculate mean
mean_value <- mean(data)
mean_value
# Calculate standard deviation
deviation <- sd(data)
deviation
Histogram:
hist(data, xlab = "Ages of Students", col = "brown", border = "black")
Bar plot:
# bar plot
H <- c(23, 32, 28, 39, 41)
M <- c("John", "Julie", "Jackie", "Sam", "Robert")
barplot(H, names.arg = M, xlab = "Name", ylab = "Age", col = "blue",
        main = "Ages")
Pie Chart:
# pie chart
x <- c(21, 30, 34, 15)
labels <- c("Chennai", "Hyderabad", "Kerala", "Mumbai")
# Plot the chart.
pie(x,labels,main="Percentage of Unemployment per City")
One of the main applications of association rule mining is Market Basket Analysis. The Apriori
algorithm is used here to perform Market Basket Analysis in R on the Groceries dataset.
Market Basket Analysis (MBA) is a data mining technique used to discover associations
between different items purchased together in transactions. It is widely applied in the retail
and e-commerce industries to understand customer purchasing behavior and to optimize product
placement.
Code:
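library(arules)
data("Groceries")
rules <- apriori(Groceries, parameter = list(supp = 0.003, conf = 0.6))  # assumed thresholds
inspect(head(rules))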
Output:
Inference:
A support value of about 0.0037 indicates that cereals and whole milk appear together in
roughly 0.37 percent of the rows of the dataset: 36 of the 9835 transactions in total.
Confidence indicates the reliability of an association rule; a confidence value of 0.64
indicates that in 64 percent of the transactions where cereals are bought, whole milk is also
bought. So we can infer that when a person buys cereals, there is a good chance they will
also buy whole milk.
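The figures quoted in this inference can be checked with a little arithmetic. This sketch uses only the numbers given in the text (9835 transactions, 36 co-occurrences, confidence 0.64); the variable names are illustrative.

```r
n_transactions <- 9835   # total transactions, from the text
count_both     <- 36     # transactions with both cereals and whole milk, from the text
confidence     <- 0.64   # reported confidence of cereals => whole milk

# Support = share of all transactions containing both items
support <- count_both / n_transactions
round(support, 4)

# confidence = count_both / count_cereals, so the implied number of
# transactions containing cereals is count_both / confidence
implied_cereals <- count_both / confidence
implied_cereals
```

Dividing 36 by 9835 gives roughly 0.0037, so a support printed as 0.003 is a truncated value. The second calculation suggests that cereals appear in only about 56 transactions overall, which is why the rule has low support despite a fairly high confidence.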
Lab 6 - Decision tree in R
We use the Iris dataset to build and plot a decision tree with the rpart library.
Code:
# Load the dataset
data(iris)
View(iris)
# Explore the structure of the dataset
str(iris)
With the help of the decision tree, we can identify the species of a flower based on its petal
length and petal width.
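The tree-fitting step is not listed in the code for this lab. The following is a minimal sketch of how it might look with rpart, assuming all four measurements are offered as predictors and the default control settings are used; the object names `fit`, `pred` and `accuracy` are illustrative.

```r
library(rpart)

data(iris)

# Fit a classification tree predicting Species from the four measurements
fit <- rpart(Species ~ ., data = iris, method = "class")

# Text summary of the splits chosen by the tree
print(fit)

# Draw the tree when running interactively (for example in RStudio)
if (interactive()) {
  plot(fit, margin = 0.1)
  text(fit, use.n = TRUE)
}

# Predicted classes on the training data and the resulting accuracy
pred <- predict(fit, iris, type = "class")
accuracy <- mean(pred == iris$Species)
accuracy
```

The printed summary lists which variables each split uses, so it can be compared directly with the statement about petal length and petal width. The accuracy here is measured on the training data, so it is an optimistic estimate.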