23CS0903 Artificial Intelligence and Machine Learning Lab Manual R23 CSE CSM
PUTTUR
(AUTONOMOUS)
(Approved by AICTE, New Delhi & Affiliated to JNTUA, Ananthapuramu)
(Accredited by NBA for EEE, Mech., ECE & CSE; Accredited by NAAC with ‘A’ Grade)
Puttur -517583, Tirupati District, A.P. (India)
II B.Tech - II Semester
Name : _____________________________________________
Section :____________________________________________
Department of CSE (23CS0903) Artificial Intelligence & Machine Learning Lab
Name of the Lab: ARTIFICIAL INTELLIGENCE & MACHINE LEARNING LAB (23CS0903)
Year & SEM : II B.TECH – II Sem.
COURSE OBJECTIVES
The objectives of this course are:
1. To study the concepts of Artificial Intelligence.
2. To learn the methods of solving problems using AI.
3. To introduce the concepts of Expert Systems and Machine Learning.
4. To learn about computing central tendency measures and data pre-processing techniques.
5. To learn about classification and regression algorithms.
6. To apply different clustering algorithms to a problem.
List of Experiments
1. Pandas Library
a. Write a python program to implement Pandas Series with labels.
b. Create a Pandas Series from a dictionary.
c. Creating a Pandas Data Frame.
d. Write a program which makes use of the following Pandas methods
i) describe () ii) head () iii) tail () iv) info ()
2. Pandas Library: Visualization
a. Write a program which uses pandas inbuilt visualization to plot the following graphs:
i) Bar plots ii) Histograms iii) Line plots iv) Scatter plots
3. Write a Program to Implement Breadth First Search using Python.
4. Write a program to implement Best First Searching Algorithm
5. Write a Program to Implement Depth First Search using Python.
6. Write a program to implement the Heuristic Search
7. Write a python program to implement A* and AO* algorithm. (Ex: find the shortest path)
8. Apply the following Pre-processing techniques for a given dataset.
a. Attribute selection
b. Handling Missing Values
c. Discretization
d. Elimination of Outliers
9. Apply KNN algorithm for classification and regression
10. Demonstrate decision tree algorithm for a classification problem and perform parameter
tuning for better results
11. Apply Random Forest algorithm for classification and regression
12. Demonstrate Naïve Bayes Classification algorithm.
13. Apply Support Vector algorithm for classification
14. Implement the K-means algorithm and apply it to the data you selected. Evaluate
performance by measuring the sum of the Euclidean distances of each example from its
cluster center. Test the performance of the algorithm as a function of the parameter K.
REFERENCES
1. Stuart J. Russell and Peter Norvig, Artificial Intelligence A Modern Approach, 4th
Edition, Pearson, 2020
2. Martin C. Brown (Author), Python: The Complete Reference, McGraw Hill Education,
Fourth edition, 2018
3. R. Nageswara Rao, Core Python Programming, Dreamtech Press India Pvt Ltd, 2018.
4. Tom M. Mitchell, Machine Learning, McGraw-Hill Publication, 2017
5. Peter Harrington, Machine Learning in Action, DreamTech
6. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, 7th
Edition, 2019.
INDEX
Ex.No-1 Date:
1. Pandas Library
a. Write a python program to implement Pandas Series with labels.
b. Create a Pandas Series from a dictionary.
c. Creating a Pandas Data Frame.
d. Write a program which makes use of the following Pandas methods
i) describe () ii) head () iii) tail () iv) info ()
Aim:
a. To write a python program to implement Pandas Series with labels.
b. To create a Pandas Series from a dictionary.
c. To create a Pandas Data Frame.
d. To write a program which makes use of the following Pandas methods
i) describe () ii) head () iii) tail () iv) info ()
Procedure:
Labels
By default, the labels of a Pandas Series are index numbers. As in a DataFrame or an
array, the index of a Series starts from 0. A label can be used to access the value stored
under it, and custom labels can be supplied through the index argument.
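As a small illustration of accessing a value through its label, the sketch below reads one element from a Series with default labels and one from a Series with custom labels. The variable names are illustrative only.

```python
import pandas as pd

# Series with the default integer labels 0, 1, 2
default_series = pd.Series([1, 3, 5])
first_value = default_series[0]

# Series with custom labels supplied through the index argument
labelled_series = pd.Series([1, 3, 5], index=["x", "y", "z"])
y_value = labelled_series["y"]

print(first_value, y_value)
```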
Source Code:
import pandas as pd
# create a list
a = [1, 3, 5]
# create a series and specify labels
my_series = pd.Series(a, index = ["x", "y", "z"])
print(my_series)
Output:
Source Code:
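# create a dictionary
grades = {"Semester1": 3.25, "Semester2": 3.28, "Semester3": 3.75}
# create a series from the dictionary; the keys become the labels (pd is imported above)
print(pd.Series(grades))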
Output:
DataFrames in Pandas:
Data sets in Pandas are usually multi-dimensional tables, called DataFrames.
A Series is like a single column, while a DataFrame is the whole table.
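To make the column/table relationship concrete, the following sketch (with illustrative variable names) builds a small DataFrame and pulls one column out of it; the extracted column is itself a Series.

```python
import pandas as pd

# A table with two columns and three rows
table = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]})

# Selecting one column by name gives back a Series
column = table["calories"]

print(type(column).__name__)
print(table.shape)
```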
Source code:
import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
myvar = pd.DataFrame(data)
print(myvar)
Output:
Source Code:
i) describe()
import pandas as pd
s = pd.Series([2, 3, 4])
# describe() returns summary statistics (count, mean, std, min, quartiles, max)
print(s.describe())
Output:
Consider the following code for head, tail and info methods.
import pandas as pd
# Creating a sample DataFrame
data = {'Name': ['Ankit', 'Bhavya', 'Charvi', 'Diya', 'Eesha'],
'Age': [25, 30, 22, 28, 35],
'City': ['New York', 'London', 'Paris', 'Tokyo', 'Sydney']}
df = pd.DataFrame(data)
ii) head()
print(df.head(3))
Output:
iii) tail()
print(df.tail(2))
Output:
iv) info()
df.info()  # info() prints its summary itself and returns None
Output:
Result:
Ex.No-2 Date:
Pandas Library: Visualization
Aim:
To write a program which uses pandas inbuilt visualization to plot the following graphs:
i) Bar plots ii) Histograms iii) Line plots iv) Scatter plots
Source Code:
i) Bar Plots
# importing matplotlib
import matplotlib.pyplot as plt
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# creating a dataframe of random values
df = pd.DataFrame(np.random.rand(10, 3), columns=['a', 'b', 'c'])
df.plot.bar()
plt.show()
Output:
ii) Histograms
plt.close("all")
df4 = pd.DataFrame({
    "a": np.random.randn(1000) + 1,
    "b": np.random.randn(1000),
    "c": np.random.randn(1000) - 1,
}, columns=["a", "b", "c"])
# overlay the three histograms with some transparency
df4.plot.hist(alpha=0.5)
plt.show()
Output:
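iii) Line plots
import pandas as pd
import matplotlib.pyplot as plt
data = {
    'Year': [2015, 2016, 2017, 2018, 2019, 2020],
    # the second column is missing from this listing; 'Value' and its numbers are placeholder data
    'Value': [10, 12, 15, 18, 22, 27]
}
df = pd.DataFrame(data)
df.plot.line(x='Year', y='Value')
plt.show()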
Output:
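iv) Scatter plots
# Prepare data
data = {'Name': ['Dhanashri', 'Smita', 'Rutuja',
                 'Sunita', 'Poonam', 'Srushti'],
        'Age': [20, 18, 27, 50, 12, 15]}
# the rest of this listing is missing; the lines below are a reconstruction (pd and plt are imported above)
df = pd.DataFrame(data)
df.reset_index().plot.scatter(x='index', y='Age')  # age against row position
plt.show()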
Output:
Result:
Ex.No-3 Date:
Write a Program to Implement Breadth First Search using Python.
Aim:
To write a Program to Implement Breadth First Search using Python.
Description:
1. To begin, place any one vertex of the graph at the rear of the queue.
2. Remove the element at the front of the queue and add it to the list of visited nodes.
3. List all the nodes adjacent to that vertex. Add those that are not in the visited list
to the rear of the queue.
4. Repeat steps 2 and 3 until the queue is empty.
A standard BFS algorithm places each node or vertex of the tree or graph into one of two
groups:
Visited
Not visited
The objective is to visit each vertex exactly once while avoiding cycles. BFS starts at a
single node, then examines all nodes at distance one, then all nodes at distance two, and
so forth. To hold the nodes that remain to be visited, BFS requires a queue (in Python, a
list can serve as one).
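The description notes that BFS needs a queue. As a hedged alternative to a plain list, the sketch below uses collections.deque, which removes items from the front in constant time. The function name bfs_order and the decision to return the visiting order are choices made for this illustration, and nodes are marked as visited when they are enqueued.

```python
from collections import deque

def bfs_order(graph, start):
    """Return the nodes reachable from start in breadth-first order."""
    visited = {start}          # nodes already placed in the queue
    queue = deque([start])     # nodes waiting to be examined
    order = []
    while queue:
        node = queue.popleft()             # take the node at the front
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in visited:   # enqueue each node only once
                visited.add(neighbour)
                queue.append(neighbour)
    return order

example_graph = {'5': ['3', '7'], '3': ['2', '4'], '7': ['8'],
                 '2': [], '4': ['8'], '8': []}
print(bfs_order(example_graph, '5'))
```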
Source Code:
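graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}
visited = []  # list of visited nodes
queue = []    # queue of nodes waiting to be examined
def bfs(visited, graph, node):
    visited.append(node)
    queue.append(node)
    while queue:
        m = queue.pop(0)  # take the node at the front of the queue
        print(m, end=" ")
        # enqueue every neighbour that has not been visited yet
        for neighbour in graph[m]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)
# driver code: start the search from node '5'
bfs(visited, graph, '5')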
Output:
Result:
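Ex.No-4 Date:
Write a program to implement Best First Searching Algorithm
Aim:
To write a program to implement Best First Searching Algorithm
Description:
Best-First Search explores the most promising states first. It keeps the candidate nodes
in a priority queue ordered by an evaluation value, repeatedly removes the entry with the
lowest value, and adds that node's unvisited neighbours to the queue, until the target is
reached or the queue is empty.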
Source code:
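from queue import PriorityQueue
v = 10  # number of nodes; assumed, the original value is missing from this listing
graph = [[] for i in range(v)]
def best_first_search(actual_src, target, n):
    visited = [False] * n
    pq = PriorityQueue()
    pq.put((0, actual_src))
    visited[actual_src] = True
    while not pq.empty():
        u = pq.get()[1]  # node with the lowest cost in the queue
        print(u, end=" ")
        if u == target:
            break
        for v, c in graph[u]:
            if visited[v] == False:
                visited[v] = True
                pq.put((c, v))
    print()
def addedge(x, y, cost):
    graph[x].append((y, cost))
    graph[y].append((x, cost))
addedge(0, 1, 3)
addedge(0, 2, 6)
addedge(0, 3, 5)
addedge(1, 4, 9)
addedge(1, 5, 8)
addedge(2, 6, 12)
source = 0
# none of the edges listed above reaches node 9, so the search ends when the queue is empty
target = 9
best_first_search(source, target, v)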
Output:
Result:
Ex.No-5 Date:
Write a Program to Implement Depth First Search using Python.
Aim:
To write a Program to Implement Depth First Search using Python.
Description:
Source code:
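graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}
visited = set()  # set of nodes that have already been visited
def dfs(visited, graph, node):
    if node not in visited:
        print(node, end=" ")
        visited.add(node)
        # go as deep as possible along each neighbour before backtracking
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)
# driver code: start the search from node '5'
dfs(visited, graph, '5')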
Output:
Result:
Ex.No-6 Date:
Write a program to implement the Heuristic Search
Aim:
To write a program to implement the Heuristic Search.
Description:
Source code:
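import heapq
# a_star() and heuristic() are missing from this listing; they are reconstructed here
# using the same heuristic as the A* listing of Experiment 7
def heuristic(a, b):
    return abs(a - b)
def a_star(graph, start, goal):
    # each heap entry is (f, g, node, path so far)
    open_set = [(heuristic(start, goal), 0, start, [start])]
    best_g = {start: 0}
    while open_set:
        f, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path, g
        for neighbor, cost in graph[node].items():
            new_g = g + cost
            if new_g < best_g.get(neighbor, float('inf')):
                best_g[neighbor] = new_g
                heapq.heappush(open_set, (new_g + heuristic(neighbor, goal), new_g, neighbor, path + [neighbor]))
    return None, None
graph = {
    0: {1: 1, 2: 3},
    1: {3: 1},
    2: {1: 1, 3: 1},
    3: {4: 3},
    4: {}
}
start_node = 0
goal_node = 4
path, cost = a_star(graph, start_node, goal_node)
if path:
    print("Shortest path:", path)
    print("Total cost:", cost)
else:
    print("No path found.")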
Output:
Result:
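Ex.No-7 Date:
Write a python program to implement A* and AO* algorithm. (Ex: find the shortest path)
Aim: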
To write a python program to implement A* and AO* algorithm for finding the shortest
path.
Description:
A* Algorithm:
A* is a popular pathfinding and graph traversal algorithm, often used in games and artificial
intelligence.
1. Initialize the open list with the start node and the closed list as empty.
2. Repeat the following steps until the open list is empty:
   a. Remove the node with the lowest f value (where f = g + h, g is the cost from the
      start node and h is the heuristic estimate to the goal) from the open list and
      add it to the closed list.
   b. Generate all possible successor nodes of the current node.
   c. For each successor:
      i) If the successor is the goal node, terminate the search and reconstruct the path.
      ii) If the successor is not in the open list, add it and calculate its g and h values.
      iii) If the successor is in the open list with a higher g value, update its g value
           and parent node.
3. Continue until the goal node is found or the open list is empty.
Source Code:
# A* Search Algorithm
import heapq
def heuristic(a, b):
    return abs(a - b)
def a_star(graph, start, goal):
    open_set = [(0, start)]
    came_from = {}
    g_score = {node: float('inf') for node in graph}
    g_score[start] = 0
    f_score = {node: float('inf') for node in graph}
    f_score[start] = heuristic(start, goal)
    while open_set:
        current_f, current = heapq.heappop(open_set)
        if current == goal:
            # walk back through came_from to rebuild the path
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1], g_score[goal]
        for neighbor, cost in graph[current].items():
            tentative_g_score = g_score[current] + cost
            if tentative_g_score < g_score[neighbor]:
                came_from[neighbor] = current
                g_score[neighbor] = tentative_g_score
                f_score[neighbor] = tentative_g_score + heuristic(neighbor, goal)
                if (f_score[neighbor], neighbor) not in open_set:
                    heapq.heappush(open_set, (f_score[neighbor], neighbor))
    return None, None
graph = {
    0: {1: 1, 2: 3},
    1: {3: 1},
    2: {1: 1, 3: 1},
    3: {4: 3},
    4: {}
}
start_node = 0
goal_node = 4
path, cost = a_star(graph, start_node, goal_node)
if path:
    print("Shortest path:", path)
    print("Total cost:", cost)
else:
    print("No path found.")
Output:
AO* Algorithm:
Source Code:
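import heapq
# Note: as given, this listing follows the A* procedure, with path reconstruction moved
# into a helper function. The lines missing from it are filled in from the A* listing.
def heuristic(a, b):
    return abs(a - b)
def a_star(graph, start, goal):
    open_set = [(0, start)]
    came_from = {}
    g_score = {node: float('inf') for node in graph}
    g_score[start] = 0
    while open_set:
        current_f, current = heapq.heappop(open_set)
        if current == goal:
            path = reconstruct_path(came_from, current)
            return path, g_score[goal]
        for neighbor, cost in graph[current].items():
            tentative_g_score = g_score[current] + cost
            if tentative_g_score < g_score[neighbor]:
                came_from[neighbor] = current
                g_score[neighbor] = tentative_g_score
                heapq.heappush(open_set, (tentative_g_score + heuristic(neighbor, goal), neighbor))
    return None, None
def reconstruct_path(came_from, current):
    path = [current]
    while current in came_from:
        current = came_from[current]
        path.append(current)
    return path[::-1]
graph = {
    0: {1: 1, 2: 3},
    1: {3: 1},
    2: {1: 1, 3: 1},
    3: {4: 3},
    4: {}
}
start_node = 0
goal_node = 4
path, cost = a_star(graph, start_node, goal_node)
if path:
    print("Shortest path:", path)
    print("Total cost:", cost)
else:
    print("No path found.")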
Output:
Result:
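Ex.No-8 Date:
Apply the following Pre-processing techniques for a given dataset.
a. Attribute selection
b. Handling Missing Values
c. Discretization
d. Elimination of Outliers
Aim:
To apply the following pre-processing techniques for a given dataset: attribute selection,
handling missing values, discretization and elimination of outliers.
Description: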
Source Code:
import pandas as pd
import numpy as np
Output:
a. Attribute selection
Output:
b. Handling Missing Values
Output:
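The discretization and outlier-removal listings in this experiment operate on a DataFrame named df_filled with numeric columns 'A' and 'C', but the code that builds it does not appear here. The following is a minimal sketch of steps (a) and (b) under stated assumptions: the sample values, the extra columns 'B' and 'D', and the choice of mean imputation are all made up for illustration and are not taken from the original dataset.

```python
import numpy as np
import pandas as pd

# Made-up sample data (an assumption; the original dataset is not shown)
df = pd.DataFrame({
    "A": [1.0, 2.0, np.nan, 4.0, 5.0, 6.0],
    "B": [10.0, np.nan, 30.0, 40.0, 50.0, 60.0],
    "C": [7.0, 8.0, 9.0, 8.0, 7.0, 100.0],   # 100.0 is a deliberate outlier
    "D": ["p", "q", "r", "s", "t", "u"],
})

# a. Attribute selection: keep only the numeric columns used in the later steps
df_selected = df[["A", "B", "C"]]
print("Selected attributes:\n", df_selected)

# b. Handling missing values: replace each NaN with the mean of its column
df_filled = df_selected.fillna(df_selected.mean())
print("\nDataframe after filling missing values:\n", df_filled)
```

Dropping incomplete rows with dropna() would be another reasonable way to handle step (b); mean imputation is used here only because it keeps every row available for the later steps.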
c. Discretization (Example: Split column 'A' into 3 equal-width bins)
num_bins = 3
df_filled['A_discretized'] = pd.cut(df_filled['A'], bins=num_bins, labels=False)
print("\nDiscretized 'A' column:\n", df_filled)
Output:
d. Elimination of Outliers (Example: Remove outliers based on IQR for column 'C')
Q1 = df_filled['C'].quantile(0.25)
Q3 = df_filled['C'].quantile(0.75)
IQR = Q3 - Q1
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
df_no_outliers = df_filled[~((df_filled['C'] < lower_bound) | (df_filled['C'] > upper_bound))]
print("\nDataframe after removing outliers:\n", df_no_outliers)
Output:
Result:
Ex.No-9 Date:
Apply KNN algorithm for classification and regression
Aim:
To apply KNN algorithm for classification and regression.
Description:
Steps:
1. Load the dataset.
2. Preprocess the data (handle missing values, encode categorical variables, etc.).
3. Split the dataset into training and testing sets.
4. Train the k-NN model.
5. Make predictions and evaluate the model.
Source Code:
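from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)  # split values are assumed
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
print("Accuracy:", (y_pred == y_test).mean())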
Output:
Steps:
1. Load the dataset.
2. Preprocess the data (handle missing values, encode categorical variables, etc.).
3. Split the dataset into training and testing sets.
4. Train the k-NN model.
5. Make predictions and evaluate the model.
Source Code:
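import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error, r2_score
# synthetic data (an assumption; the dataset used originally is not shown)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2 * X.ravel() + rng.normal(0, 1, size=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
knn_regressor = KNeighborsRegressor(n_neighbors=3)
knn_regressor.fit(X_train, y_train)
y_pred = knn_regressor.predict(X_test)
print("Mean squared error:", mean_squared_error(y_test, y_pred))
print("R2 score:", r2_score(y_test, y_pred))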
Output:
Result:
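Ex.No-10 Date:
Demonstrate decision tree algorithm for a classification problem and perform parameter
tuning for better results
Aim:
To demonstrate decision tree algorithm for a classification problem and perform parameter
tuning for better results.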
Description:
Source Code:
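from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
iris = load_iris()
X, y = iris.data, iris.target
# hold out a test set (split values are assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
dt_classifier = DecisionTreeClassifier()
param_grid = {
    'criterion': ['gini', 'entropy'],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}
# cross-validated search over the grid (cv and scoring are assumed)
grid_search = GridSearchCV(dt_classifier, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)
best_dt_classifier = grid_search.best_estimator_
print("Best parameters:", grid_search.best_params_)
y_pred = best_dt_classifier.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))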
Output:
Result:
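Ex.No-11 Date:
Apply Random Forest algorithm for classification and regression
Aim:
To apply Random Forest algorithm for classification and regression.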
Description:
Source Code:
iris = load_iris()
X, y = iris.data, iris.target
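The listing above stops after loading the Iris data. A minimal sketch of how a Random Forest classification run might continue is given here; the split ratio, random_state and n_estimators values are assumptions chosen only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

# Assumed split and hyperparameters, chosen only for illustration
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
rf_classifier.fit(X_train, y_train)

# Predict the held-out samples and measure the fraction classified correctly
y_pred = rf_classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```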
Output:
Source Code:
iris = load_iris()
X, y = iris.data, iris.target
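This second listing is also cut off after loading the data, and the regression target it used is not shown. The sketch below assumes a hypothetical regression task on the same dataset: predicting petal width (column 3) from the other three measurements. That choice, together with the split and hyperparameter values, is an assumption made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

iris = load_iris()
# Hypothetical regression task: predict petal width from the other measurements
X = iris.data[:, :3]
y = iris.data[:, 3]

# Assumed split and hyperparameters, chosen only for illustration
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
rf_regressor = RandomForestRegressor(n_estimators=100, random_state=42)
rf_regressor.fit(X_train, y_train)

# Evaluate the predictions on the held-out samples
y_pred = rf_regressor.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean squared error:", mse)
print("R2 score:", r2)
```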
Output:
Result:
Ex.No-12 Date:
Demonstrate Naïve Bayes Classification algorithm.
Aim:
To demonstrate Naïve Bayes Classification algorithm.
Description:
The Naive Bayes Classifier is a machine learning model used for classification tasks. It is
based on Bayes' Theorem, which calculates the probability of an event based on prior knowledge
of conditions related to the event. The classifier assumes that all features (attributes) are
independent of each other, which is why it's called "naive."
Despite this naive assumption, the Naive Bayes Classifier performs well in many
situations, especially in text classification tasks like spam detection and sentiment analysis. It is
known for its simplicity and efficiency.
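To show how Bayes' Theorem and the independence assumption combine, here is a small worked sketch for a toy spam filter. Every probability in it is made up for illustration, and the names prior, word_likelihood and posterior are hypothetical.

```python
# Hypothetical training summary for a tiny spam filter (all numbers are made up)
prior = {"spam": 0.4, "ham": 0.6}
word_likelihood = {
    "spam": {"offer": 0.5, "meeting": 0.1},
    "ham": {"offer": 0.1, "meeting": 0.4},
}

def posterior(words):
    """Return P(class | words) under the naive independence assumption."""
    scores = {}
    for label in prior:
        # P(class) multiplied by P(word | class) for every observed word
        score = prior[label]
        for word in words:
            score *= word_likelihood[label][word]
        scores[label] = score
    # Normalize so the class probabilities sum to 1
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

result = posterior(["offer"])
print(result)
```

For the single word "offer", the unnormalized scores are 0.4 × 0.5 = 0.2 for spam and 0.6 × 0.1 = 0.06 for ham, so the message is classified as spam.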
Source code:
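from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)  # split values are assumed
gnb = GaussianNB()
y_pred = gnb.fit(X_train, y_train).predict(X_test)
print("Mislabeled points out of %d: %d" % (X_test.shape[0], (y_test != y_pred).sum()))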
Output:
Result:
Ex.No-13 Date:
Apply Support Vector algorithm for classification.
Aim:
To apply Support Vector algorithm for classification.
Description:
Support Vector Machine (SVM) is a classification algorithm that works by finding the best
hyperplane that separates data points of different classes.
1. Hyperplane: A decision boundary that separates different classes.
2. Support Vectors: Data points closest to the hyperplane, crucial for defining its position.
3. Margin: The gap between the hyperplane and the nearest data points from each class,
which the SVM aims to maximize.
SVMs can handle both linear and non-linear data using something called the kernel trick.
They're especially effective in high-dimensional spaces.
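The terms hyperplane and margin can be illustrated with a few lines of arithmetic. The sketch below uses a hypothetical hyperplane w·x + b = 0 whose values were picked by hand for illustration; it is not the result of training, and an SVM would instead search for the w and b that make the margin as large as possible.

```python
import math

# Hypothetical hyperplane w.x + b = 0 in two dimensions (hand-picked values)
w = (1.0, 0.0)
b = -3.5

def decision_value(point):
    """Signed value of w.x + b; its sign tells which side the point is on."""
    return sum(wi * xi for wi, xi in zip(w, point)) + b

def predict(point):
    return 1 if decision_value(point) >= 0 else 0

def distance_to_hyperplane(point):
    """Perpendicular distance from the point to the hyperplane."""
    return abs(decision_value(point)) / math.hypot(*w)

points = [(1, 2), (2, 3), (3, 1), (4, 3), (5, 3), (6, 2)]
labels = [predict(p) for p in points]

# Gap between this hyperplane and the nearest point on either side
margin = min(distance_to_hyperplane(p) for p in points)
print(labels, margin)
```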
Steps:
1. Import Libraries
2. Load the Dataset
3. Split the Dataset
4. Train the Support Vector Machine Model
5. Make Predictions
6. Evaluate the Model
Source code:
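from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
X = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y = [0, 0, 0, 1, 1, 1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42, stratify=y)  # split values are assumed
svm_classifier = SVC(kernel='linear')
svm_classifier.fit(X_train, y_train)
y_pred = svm_classifier.predict(X_test)
print("Predictions:", y_pred)
print("Accuracy:", svm_classifier.score(X_test, y_test))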
Output:
Result:
Ex.No-14 Date:
Implement the K-means algorithm and apply it to the data you selected.
Aim:
To implement the K-means algorithm and apply it to the selected data, to evaluate
performance by measuring the sum of the Euclidean distances of each example from its
cluster center, and to test the performance of the algorithm as a function of the parameter K.
Description:
K-means Algorithm:
The K-Means Algorithm is a clustering technique that partitions data into K clusters. It
assigns each data point to the nearest cluster center (centroid) and iteratively refines the
cluster centers until convergence. The sum of the Euclidean distances from each data point
to its cluster center measures how well the data points are clustered: the smaller the sum,
the tighter the clusters.
Source code:
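The listing for this experiment is not present here. What follows is a minimal from-scratch sketch of the procedure described above, not the original program: the sample points, the seed, the initialization from randomly sampled data points and all function names are assumptions made for illustration.

```python
import math
import random

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
            for p in points]

def k_means(points, k, max_iterations=100, seed=0):
    """Plain K-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # start from k distinct data points
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                # move the centroid to the mean of its members
                new_centroids.append(tuple(sum(dim) / len(members)
                                           for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:   # no centroid moved: converged
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)

def total_distance(points, centroids, labels):
    """Sum of Euclidean distances from each point to its cluster center."""
    return sum(euclidean(p, centroids[label]) for p, label in zip(points, labels))

# Made-up 2-D data with two visible groups
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]

# Evaluate the clustering as a function of K
scores = {}
for k in range(1, 5):
    centroids, labels = k_means(data, k)
    scores[k] = total_distance(data, centroids, labels)
    print("K =", k, "sum of distances =", round(scores[k], 3))
```

Plotting or tabulating the printed sums against K is one common way to judge where adding more clusters stops giving a meaningful improvement.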
Output:
Result: