Recent Trends in IT Practical Solutions

The document outlines practical assignments in IT focusing on Artificial Intelligence, Data Mining, and Spark programming. It includes various programming tasks such as implementing algorithms like Breadth First Search, Depth First Search, and decision tree classification using Weka and R. Additionally, it covers Spark operations like map, filter, and DataFrame manipulations.


Recent Trends In IT

Artificial Intelligence Practical Assignments (Python / R programming)

1. Write a program to implement the Breadth First Search algorithm.



from collections import deque

def bfs(graph, start):
    visited = set()
    queue = deque([start])
    visited.add(start)
    while queue:
        node = queue.popleft()
        print(node, end=" ")
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)

# Example graph and execution
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A', 'F'],
    'D': ['B'],
    'E': ['B', 'F'],
    'F': ['C', 'E']
}
bfs(graph, 'A')

2. Write a program to implement the Depth First Search algorithm.



def dfs(graph, start, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    print(start, end=" ")
    for neighbor in graph[start]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)

# Example graph and execution
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A', 'F'],
    'D': ['B'],
    'E': ['B', 'F'],
    'F': ['C', 'E']
}
dfs(graph, 'A')

3. Write a program to implement the water jug problem.



def water_jug_problem(jug1, jug2, target):
    visited = set()
    queue = [(0, 0)]  # Start with both jugs empty
    while queue:
        (a, b) = queue.pop(0)
        if (a, b) in visited:
            continue
        print(f"Jug1: {a}, Jug2: {b}")
        visited.add((a, b))
        if a == target or b == target:
            return True
        queue.extend([
            (jug1, b),  # Fill Jug1
            (a, jug2),  # Fill Jug2
            (0, b),     # Empty Jug1
            (a, 0),     # Empty Jug2
            (min(a + b, jug1), max(0, a + b - jug1)),  # Pour Jug2 -> Jug1
            (max(0, a + b - jug2), min(a + b, jug2))   # Pour Jug1 -> Jug2
        ])
    return False

# Example execution
water_jug_problem(4, 3, 2)

4. Write a program to implement the Tower of Hanoi problem.



def tower_of_hanoi(n, source, target, auxiliary):
    if n == 1:
        print(f"Move disk 1 from {source} to {target}")
        return
    tower_of_hanoi(n-1, source, auxiliary, target)
    print(f"Move disk {n} from {source} to {target}")
    tower_of_hanoi(n-1, auxiliary, target, source)

# Example execution
tower_of_hanoi(3, 'A', 'C', 'B')

Data Mining Practical Assignments:

1. Build a classification model in Weka using a Decision Tree algorithm to classify data from the "weather.arff" file. Perform initial preprocessing and create a version of the initial dataset in which all numeric attributes are converted to categorical data.
→ Steps:
1. Open Weka Explorer.
2. Load the weather.arff dataset.
3. Preprocess:
   ○ Convert numeric attributes to categorical: use the "Discretize" filter in the "Preprocess" tab.
4. Apply Decision Tree:
   ○ Go to the "Classify" tab.
   ○ Choose J48 under the "Trees" category.
5. Evaluate the model (e.g., cross-validation or the training set); a cross-validation sketch follows the R code below.

Using R:
# Install and load required libraries
library(RWeka)
# Load dataset
weather <- read.arff("weather.arff")
# Preprocess: convert the numeric attributes to categorical
# (temperature and humidity are the numeric attributes in the standard weather.arff)
weather$temperature <- cut(weather$temperature, breaks = 3, labels = c("Low", "Medium", "High"))
weather$humidity <- cut(weather$humidity, breaks = 3, labels = c("Low", "Medium", "High"))
# Train Decision Tree ("play" is the class attribute)
model <- J48(play ~ ., data = weather)
summary(model)
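
Step 5 of the Weka procedure asks for an evaluation; a minimal sketch of the same check in R, using RWeka's evaluator on the model trained above:

# 10-fold cross-validation; the printed output includes accuracy and the confusion matrix
eval <- evaluate_Weka_classifier(model, numFolds = 10)
eval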

2. Use the database "labor.arff". Apply Linear Regression and find out the total number of instances (using Weka / R).

library(foreign)
# Load dataset
labor <- read.arff("labor.arff")
# Linear Regression (the target must be a numeric attribute; replace
# TargetAttribute with the numeric attribute to be modelled)
model <- lm(TargetAttribute ~ ., data = labor)
summary(model)
# Count total instances
nrow(labor)
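
The same task can also be run with Weka's own LinearRegression learner through RWeka; a minimal sketch, assuming the standard labor.arff in which duration is a numeric attribute that can serve as the target:

library(RWeka)
# Weka's LinearRegression requires a numeric class attribute
lr_model <- LinearRegression(duration ~ ., data = labor)
lr_model
# Total number of instances (Weka also shows this in the Preprocess tab)
nrow(labor)
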
3. Apply the Naïve Bayes algorithm to the diabetes data from the "diabetes.arff" file. Perform initial preprocessing and create a version of the initial dataset in which the ID field is removed and the "Type" attribute is converted to categorical data (using Weka / R).

# naiveBayes() comes from e1071 (RWeka does not export a Naïve Bayes function directly)
library(e1071)
library(foreign)  # read.arff()
# Load dataset
diabetes <- read.arff("diabetes.arff")
# Preprocess: remove the ID field
diabetes$ID <- NULL
# Convert "Type" to categorical
diabetes$Type <- as.factor(diabetes$Type)
# Train Naïve Bayes
model <- naiveBayes(Type ~ ., data = diabetes)
model
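
To check the fit, predictions on the training data can be cross-tabulated against the true labels (a quick sketch; "Type" is the class attribute named in the assignment):

# Predict on the training data and build a confusion matrix
pred <- predict(model, diabetes)
table(Predicted = pred, Actual = diabetes$Type)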

4. Build a classification model in Weka using the Naïve Bayes algorithm to classify data from the "Iris.arff" file on the plant-growth and leaf attributes. Perform initial pre-processing and create a version of the initial dataset in which all numeric attributes are converted to categorical data.

Weka:
1. Load Iris.arff.
2. Preprocess:
   ○ Convert numeric attributes to categorical using "Discretize."
3. Apply Naïve Bayes under the "Classify" tab.

Using R:
# naiveBayes() from e1071; attribute names below follow R's built-in iris
# convention and may differ in iris.arff (e.g. sepallength / class)
library(e1071)
library(foreign)  # read.arff()
# Load dataset
iris <- read.arff("iris.arff")
# Preprocess: convert numeric attributes to categorical
# (repeat for Sepal.Width, Petal.Length and Petal.Width)
iris$Sepal.Length <- cut(iris$Sepal.Length, breaks = 3, labels = c("Short", "Medium", "Long"))
# Train Naïve Bayes
model <- naiveBayes(Species ~ ., data = iris)
model
5. Apply the ZeroR classification algorithm to the supermarket data from the "supermarket.arff" file. Perform initial preprocessing and analyze the confusion matrix (using Weka / R).

library(RWeka)
# Load dataset
supermarket <- read.arff("supermarket.arff")
# ZeroR is not exported by RWeka directly; build the interface first
ZeroR <- make_Weka_classifier("weka/classifiers/rules/ZeroR")
# Train ZeroR (replace Class with the dataset's class attribute, e.g. "total" in the standard supermarket.arff)
model <- ZeroR(Class ~ ., data = supermarket)
summary(model)
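
The assignment also asks for the confusion matrix; a short sketch using RWeka's evaluator with 10-fold cross-validation:

# Cross-validated evaluation; the printed summary includes the confusion matrix
eval <- evaluate_Weka_classifier(model, numFolds = 10)
eval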

6. Use the SimpleKMeans algorithm on the database "bank.arff" with the default settings and find the final cluster centroids.

Weka:
1. Load bank.arff.
2. Use "SimpleKMeans" under the "Cluster" tab.
3. Analyze the centroids in the output.

Using R:
library(foreign)  # read.arff()
# Load dataset
bank <- read.arff("bank.arff")
# K-means works on numeric data only, so keep just the numeric columns
bank_num <- bank[sapply(bank, is.numeric)]
# K-means clustering (centers = 2 mirrors Weka's default number of clusters)
kmeans_result <- kmeans(bank_num, centers = 2)
kmeans_result$centers

7. Use the Apriori algorithm on the transaction data in the "vote.arff" file and identify all frequent k-itemsets with a minimum support of 40%. Perform initial preprocessing and find all the frequent itemsets (using Weka / R).

library(arules)
library(foreign)  # read.arff()
# Load dataset
vote <- read.arff("vote.arff")
# Convert to transactions
transactions <- as(vote, "transactions")
# Apply Apriori with 40% minimum support, mining the frequent itemsets themselves
itemsets <- apriori(transactions, parameter = list(supp = 0.4, target = "frequent itemsets"))
inspect(itemsets)
8. Apply the Hierarchical Clustering algorithm to the tic-tac-toe data from the "tic-tac-toe.arff" file. Perform initial pre-processing and create a version of the initial dataset in which the ID field is removed and the "class" attribute is converted to categorical data (using Weka / R).

library(cluster)
library(foreign)  # read.arff()
# Load dataset
tic_tac_toe <- read.arff("tic-tac-toe.arff")
# Preprocess: remove ID and convert class to categorical
tic_tac_toe$ID <- NULL
tic_tac_toe$class <- as.factor(tic_tac_toe$class)
# The board attributes are nominal, so use Gower dissimilarities from cluster::daisy()
distance <- daisy(tic_tac_toe[, -ncol(tic_tac_toe)], metric = "gower")
hclust_result <- hclust(distance)
plot(hclust_result)
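
To turn the dendrogram into actual cluster assignments, the tree can be cut at a chosen number of clusters (a sketch using base R's cutree):

# Assign each instance to one of k = 2 clusters and compare against the class labels
clusters <- cutree(hclust_result, k = 2)
table(clusters, tic_tac_toe$class)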

9. Build a classification model in Weka using a Decision Tree algorithm to classify data from the "labor.arff" file. Perform initial preprocessing and create a version of the initial dataset in which all numeric attributes are converted to categorical data.

Using Weka

1. Load the Dataset:
   ○ Open Weka Explorer.
   ○ Load the labor.arff dataset from the "Preprocess" tab.
2. Preprocess the Dataset:
   ○ Convert Numeric Attributes to Categorical:
      ■ Use the "Discretize" filter under the "Preprocess" tab:
         1. Select "Choose" → filters.unsupervised.attribute.Discretize.
         2. Apply the filter to all numeric attributes.
3. Apply the Decision Tree Algorithm:
   ○ Go to the "Classify" tab.
   ○ Select J48 from the "trees" algorithms list (this is Weka's implementation of the C4.5 decision tree).
   ○ Set your class attribute (target variable) if not already set.
   ○ Run the classification algorithm.
Using R
# Load required libraries
library(RWeka)
library(foreign)  # for loading ARFF files
# Load the dataset
labor <- read.arff("labor.arff")
# Preprocess: convert numeric attributes to categorical
# (duration and working-hours are two numeric attributes in the standard labor.arff;
#  hyphenated names need backticks, and names may differ if your loader renames them)
labor$duration <- cut(labor$duration, breaks = 3, labels = c("Low", "Medium", "High"))
labor$`working-hours` <- cut(labor$`working-hours`, breaks = 3, labels = c("Low", "Medium", "High"))
# Train Decision Tree ("class" is the class attribute)
model <- J48(class ~ ., data = labor)
# Summarize the decision tree
summary(model)
# Visualize the tree (if needed)
plot(model)

10. Use the FP-Growth algorithm and the Apriori algorithm on the database "supermarket.arff" with the default settings, and find out which algorithm generates the maximum number of rules.
→ Weka:
1. Load supermarket.arff.
2. Use "FP-Growth" and "Apriori" in the "Associate" tab with default settings.
3. Compare the number of generated rules.

Using R:
library(arules)
library(foreign)  # read.arff()
# Load dataset
supermarket <- read.arff("supermarket.arff")
# Convert to transactions
transactions <- as(supermarket, "transactions")
# Apriori
apriori_rules <- apriori(transactions, parameter = list(supp = 0.1, conf = 0.8))
inspect(apriori_rules)
length(apriori_rules)   # number of Apriori rules
# FP-Growth is not part of arules itself; one option (assumed here) is the rCBA
# package's fpgrowth(); alternatively compare the two algorithms in Weka's Associate tab
library(rCBA)
fp_growth_rules <- fpgrowth(transactions, support = 0.1, confidence = 0.8)
inspect(fp_growth_rules)
length(fp_growth_rules) # number of FP-Growth rules
Spark Practical Assignments:

1. Write a Spark program to apply the map() function, passing the expression required to perform the transformation.

from pyspark.sql import SparkSession
# Initialize Spark Session
spark = SparkSession.builder.appName("MapFunctionExample").getOrCreate()
sc = spark.sparkContext
# Create an RDD
rdd = sc.parallelize([1, 2, 3, 4, 5])
# Apply map function to square each element
result = rdd.map(lambda x: x ** 2)
# Collect and print results
print(result.collect())

2. Write a Spark program to apply the filter() function, passing the expression required to perform the filtering.

# (Programs 2-7 reuse the SparkSession and SparkContext created in program 1)
# Create an RDD
rdd = sc.parallelize([1, 2, 3, 4, 5])
# Apply filter to select only even numbers
result = rdd.filter(lambda x: x % 2 == 0)
# Collect and print results
print(result.collect())

3. Write a Spark program to apply the intersection() function to return the intersection of the elements of two RDDs.

# Create two RDDs
rdd1 = sc.parallelize([1, 2, 3, 4, 5])
rdd2 = sc.parallelize([3, 4, 5, 6, 7])
# Apply intersection
result = rdd1.intersection(rdd2)
# Collect and print results
print(result.collect())
4. Write a Spark program to apply the reduceByKey() function to aggregate values by key.

# Create an RDD with key-value pairs
rdd = sc.parallelize([("a", 1), ("b", 2), ("a", 3), ("b", 4)])
# Apply reduceByKey to aggregate values by key
result = rdd.reduceByKey(lambda x, y: x + y)
# Collect and print results
print(result.collect())

5. Write a Spark program to demonstrate the different ways to create a DataFrame.



from pyspark.sql import Row
# Example 1: From RDD
rdd = sc.parallelize([("Alice", 25), ("Bob", 30)])
df1 = spark.createDataFrame(rdd, schema=["Name", "Age"])
# Example 2: From List
data = [("Alice", 25), ("Bob", 30)]
df2 = spark.createDataFrame(data, schema=["Name", "Age"])
# Example 3: From Rows
rows = [Row(Name="Alice", Age=25), Row(Name="Bob", Age=30)]
df3 = spark.createDataFrame(rows)
# Show DataFrames
df1.show()
df2.show()
df3.show()

6. Write a Spark program to add or update columns in a DataFrame.



# Create a DataFrame
data = [("Alice", 25), ("Bob", 30)]
df = spark.createDataFrame(data, schema=["Name", "Age"])
# Add a new column
df = df.withColumn("NewColumn", df["Age"] + 5)
# Update an existing column
df = df.withColumn("Age", df["Age"] * 2)
# Show the DataFrame
df.show()
7. Write a Spark program to remove duplicate rows based on multiple selected columns.

# Create a DataFrame
data = [("Alice", 25, "NY"), ("Bob", 30, "CA"), ("Alice", 25, "NY")]
df = spark.createDataFrame(data, schema=["Name", "Age", "City"])
# Remove duplicates based on "Name" and "Age"
df_distinct = df.dropDuplicates(["Name", "Age"])
# Show the DataFrame
df_distinct.show()
