
Artificial Intelligence
(3171105)

Lab Manual

Semester - 7

Electronics and Communication Department
Government Engineering College,
Bharuch

CERTIFICATE

This is to certify that Mr./Ms. ________________________

of B.E. ______ semester enrollment no _________________

Branch ______________________ has been found satisfactory

in the continuous internal evaluation of the laboratory, practical and

term work in the subject ________________________ for the

academic year 20___-20___.

Date: Sign of Faculty


Sr No    Experiment    Date    Page    Sign

1
Understand Google Colab and get familiar with it.

2
To implement basic operations using the Pandas and NumPy
libraries for data manipulation, including grouping, filtering,
sorting, and merging operations.

3
To implement Depth-First Search (DFS) and Breadth-First
Search (BFS) algorithms for traversing graphs using Python.

4
To implement the A* algorithm using Python's NetworkX
library for finding the shortest path in a graph.

5
To implement the Hill Climbing algorithm for solving a
scheduling problem where classes, professors, and rooms must
be allocated based on specific constraints.

6
To implement a Genetic Algorithm (GA) for optimizing a
dietary plan, showcasing how GA overcomes the limitations of
Hill Climbing in terms of finding global optima.

7
To implement a Natural Language Processing (NLP) task using
Python's NLTK library to filter out stop words from a given
sentence.

8
To implement word stemming using Python's NLTK library,
specifically using the Porter Stemmer to reduce words to their
root forms.

9
To implement lemmatization using Python's NLTK library to
convert words into their base or dictionary form (lemma).

10
To implement the Bag of Words model to represent text data as
a matrix of word frequencies using Python's sklearn library.

11
Demonstration of Image classification using Deep Learning.


Practical 1
Aim:
Understand Google Colab and get familiar with it.

Theory: Students may not always have systems with the hardware configuration
required to install Python, there may be compatibility issues with the operating
system, or for some other reason you may be unable to install Python and start
your work. In such cases the solution is a Google Colab notebook
(https://colab.research.google.com/notebooks/welcome.ipynb): just log in with your
Gmail account and start working in a Jupyter notebook, as simple as that.

Google provides a free cloud service based on Jupyter notebooks that supports a
free GPU. Not only is this a great tool for improving your coding skills, but it
also allows absolutely anyone to develop deep learning applications using popular
libraries such as PyTorch, TensorFlow, Keras, and OpenCV. Colab provides a GPU
and it is totally free.

There are some limits. It supports Python 2.7 and 3.6. There is a limit to your
sessions and size. You can create notebooks in Colab, upload notebooks, store
notebooks, share notebooks, mount your Google Drive and use whatever you’ve
got stored in there, import most of your favorite directories, upload your personal
Jupyter Notebooks, upload notebooks directly from GitHub, upload Kaggle files,
download your notebooks, and do just about everything else that you might want
to be able to do.

Working in Google Colab is easy, but it hasn’t been without a couple of small
challenges.


Setting up your drive

Create a folder for your notebooks. This step isn't totally necessary if you want
to just start working in Colab. However, since Colab works off your Drive, it's a
good idea to specify the folder where you want to work.

You can do that by going to Google Drive, clicking "New" and then creating a new
folder. If you want, while you're already in your Google Drive you can create a
new Colab notebook: just click "New", drop the menu down to "More" and then select
"Colaboratory." Otherwise, you can always go directly to Google Colab.

You can rename your notebook by clicking on its name and changing it, or by
dropping the "File" menu down to "Rename."

Set up your free GPU

First go to the "Runtime" dropdown menu, select "Change runtime type" and choose
GPU in the hardware accelerator drop-down menu. Then click the "File" option and
select a new notebook; the new Jupyter notebook will open in Google Colab, and you
may start your work.


Step 1: Create a new notebook; a new window as shown below will pop up.

Step 2: Click on "+ Code" and a new cell will open. Here you can write your code.

Step 3: After writing the code, click on "Connect" (top right) as shown in the
figure below to connect to the cloud server. Then run the cell using the play
button or by pressing Shift+Enter.

Step 4: It will display the output.
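Since the theory above mentions mounting your Google Drive, here is a minimal sketch of the standard Colab call (run inside a Colab cell; it prompts for authorisation):

# Mount Google Drive so files stored there are visible under /content/drive
from google.colab import drive
drive.mount('/content/drive')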

Conclusion:


Practical 2

Aim:
To implement basic operations using the Pandas and NumPy libraries for
data manipulation, including grouping, filtering, sorting, and merging
operations.

Theory:
Pandas and NumPy are foundational libraries in Python, widely used in data manipulation and
analysis, especially in the context of AI and machine learning. NumPy provides support for
multi-dimensional arrays and mathematical operations, allowing fast numerical computation.
Pandas, on the other hand, offers powerful tools for working with tabular data, using
DataFrames to handle and manipulate structured datasets efficiently.
The importance of these libraries lies in their ability to handle large datasets, a common
requirement in AI, where preparing data before feeding it into models is crucial. Operations
such as sorting, filtering, and merging enable users to clean, transform, and organize data in
preparation for machine learning algorithms. These preprocessing tasks are essential for
ensuring that AI models can effectively learn patterns from data.
In this practical session, students will learn how to perform these key operations using NumPy
and Pandas. The goal is to get students familiar with fundamental data handling concepts,
which will be crucial as they move forward in more advanced AI applications, including feature
engineering and data visualization.

Code:
# Importing necessary libraries
import numpy as np
import pandas as pd

# NumPy operations: sorting and filtering
arr = np.array([3, 6, 1, 9, 2, 5])
sorted_arr = np.sort(arr)
print("Sorted Array:", sorted_arr)

filtered_arr = arr[arr > 4]
print("Filtered Array:", filtered_arr)

# Pandas DataFrame creation
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [24, 27, 22, 32]}
df = pd.DataFrame(data)
print(df)

# Sorting the DataFrame by 'Age'
sorted_df = df.sort_values(by='Age')
print("\nSorted DataFrame by Age:\n", sorted_df)

# Merging DataFrames
data1 = {'Key': ['K0', 'K1', 'K2', 'K3'],
         'Name': ['Parth', 'Ketan', 'Trupal', 'Nil']}
data2 = {'Key': ['K0', 'K1', 'K2', 'K3'],
         'Town': ['Nagpur', 'Kanpur', 'Allahabad', 'Kannauj'],
         'Qualification': ['B.tech', 'B.A', 'B.Com', 'B.Hons']}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
merged_df = pd.merge(df1, df2, on='Key')
print("\nMerged DataFrame:\n", merged_df)

Output:

Sorted Array: [1 2 3 5 6 9]
Filtered Array: [6 9 5]

      Name  Age
0    Alice   24
1      Bob   27
2  Charlie   22
3    David   32

Sorted DataFrame by Age:
      Name  Age
2  Charlie   22
0    Alice   24
1      Bob   27
3    David   32

Merged DataFrame:
  Key    Name       Town Qualification
0  K0   Parth     Nagpur        B.tech
1  K1   Ketan     Kanpur           B.A
2  K2  Trupal  Allahabad         B.Com
3  K3     Nil    Kannauj        B.Hons
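The aim also lists grouping, which the code above does not demonstrate. A minimal sketch using pandas groupby (illustrative only; the 'Dept' column is invented for this sketch and its output is not part of the recorded results above):

# Grouping example (illustrative): average age per department
df['Dept'] = ['EC', 'EC', 'CS', 'CS']   # hypothetical grouping column
grouped = df.groupby('Dept')['Age'].mean()
print("\nAverage Age by Dept:\n", grouped)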

Conclusion:


Practical 3
Aim:
To implement Depth-First Search (DFS) and Breadth-First Search (BFS)
algorithms for traversing graphs using Python.

Theory:
DFS and BFS are fundamental algorithms used for traversing or searching tree or graph data
structures. These algorithms are essential for exploring all possible nodes in a graph, which is
an important task in AI, search problems, and network analysis.
DFS explores a graph by going as deep as possible into the graph before backtracking,
following one path fully before exploring others. It is suitable for scenarios where the entire
search space needs to be explored.
On the other hand, BFS explores all the neighbors at the present depth level before moving on
to nodes at the next depth level. It is useful for finding the shortest path in an unweighted graph.
These graph traversal techniques are critical in AI for problem-solving, pathfinding algorithms,
and game AI, where understanding the structure of interconnected nodes (like a map or
network) is necessary. By learning these algorithms, students will gain foundational skills for
implementing more complex search algorithms in AI.

Problem:

Code:
Depth-First Search (DFS)
# Depth-First Search (DFS) Algorithm
graph = {
    'Vadodara': ['Bharuch', 'Rajpipla'],
    'Bharuch': ['Vadodara', 'Surat', 'Rajpipla'],
    'Rajpipla': ['Vadodara', 'Bharuch', 'Surat', 'Vyara'],
    'Surat': ['Bharuch', 'Rajpipla', 'Vyara', 'Navsari'],
    'Navsari': ['Surat', 'Vyara', 'Valsad'],
    'Vyara': ['Surat', 'Rajpipla', 'Navsari', 'Ahwa'],
    'Valsad': ['Navsari', 'Ahwa'],
    'Ahwa': ['Vyara', 'Valsad'],
}

visited = set()  # Set to keep track of visited nodes

def dfs(visited, graph, node):
    if node not in visited:
        print(node)
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)

# Driver Code
print("Following is the Depth-First Search")
dfs(visited, graph, 'Vadodara')

Breadth-First Search (BFS)

# Breadth-First Search (BFS) Algorithm
graph = {
    'Vadodara': ['Bharuch', 'Rajpipla'],
    'Bharuch': ['Vadodara', 'Surat', 'Rajpipla'],
    'Rajpipla': ['Vadodara', 'Bharuch', 'Surat', 'Vyara'],
    'Surat': ['Bharuch', 'Rajpipla', 'Vyara', 'Navsari'],
    'Navsari': ['Surat', 'Vyara', 'Valsad'],
    'Vyara': ['Surat', 'Rajpipla', 'Navsari', 'Ahwa'],
    'Valsad': ['Navsari', 'Ahwa'],
    'Ahwa': ['Vyara', 'Valsad'],
}

visited = []  # List for visited nodes
queue = []    # Initialize a queue

def bfs(visited, graph, node):
    visited.append(node)
    queue.append(node)

    while queue:
        m = queue.pop(0)  # Dequeue the next node (FIFO)
        print(m, end=" ")
        for neighbour in graph[m]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

# Driver Code
print("Following is the Breadth-First Search")
bfs(visited, graph, 'Vadodara')

Output:
DFS Output:
Following is the Depth-First Search
Vadodara
Bharuch
Surat
Navsari
Valsad
Ahwa
Vyara
Rajpipla

BFS Output:
Following is the Breadth-First Search
Vadodara Bharuch Rajpipla Surat Vyara Navsari Ahwa Valsad
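As a side note, queue.pop(0) on a Python list is O(n); a minimal variant using collections.deque (a sketch, not part of the recorded output above) pops from the left in O(1):

from collections import deque

def bfs_deque(graph, start):
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()  # O(1) dequeue
        print(node, end=" ")
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)

bfs_deque(graph, 'Vadodara')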

Conclusion:


Practical 4
Aim:
To implement the A* algorithm using Python's NetworkX library for finding
the shortest path in a graph.
Theory:
The A* algorithm is a popular pathfinding and graph traversal algorithm used in AI, especially
in applications like robotics, gaming, and route planning. It combines the concepts of Dijkstra's
algorithm and a heuristic-based search.
A* searches for the shortest path from a given start node to a target node by evaluating, for each node n, the sum of two factors:
The actual cost to reach the current node from the start node, denoted g(n).
The estimated cost to reach the goal from the current node, given by a heuristic function h(n).
A* expands the node with the lowest f(n) = g(n) + h(n); this combination helps the algorithm find the optimal path efficiently by balancing the cost already incurred (g(n)) against the targeted search toward the goal (h(n)).
In this session, we use the NetworkX library to define a graph, assign edge weights, and apply
the A* algorithm to find the shortest path between two nodes. We also define a heuristic
function based on the expected distance to the target.

Problem:

Code:
# Importing necessary libraries
import networkx as nx

# Define the graph using networkx
G = nx.Graph()

# Add edges and their weights
edges = [
    ('A', 'B', 6),
    ('A', 'F', 3),
    ('B', 'C', 3),
    ('B', 'D', 2),
    ('C', 'D', 1),
    ('C', 'E', 5),
    ('D', 'E', 8),
    ('E', 'J', 5),
    ('E', 'I', 5),
    ('F', 'G', 1),
    ('F', 'H', 7),
    ('G', 'I', 3),
    ('H', 'I', 2),
    ('I', 'J', 3)
]

# Add edges to the graph
for edge in edges:
    G.add_edge(edge[0], edge[1], weight=edge[2])

# Define the heuristic (estimated distance from each node to the target node)
heuristic = {
    'A': 10,
    'B': 5,
    'C': 5,
    'D': 1,
    'E': 3,
    'F': 6,
    'G': 5,
    'H': 4,
    'I': 1,
    'J': 0
}

def heuristic_func(n, target):
    return heuristic[n]

# Find the shortest path using the A* algorithm
path = nx.astar_path(G, 'A', 'J', heuristic=heuristic_func)
print('Path found:', path)

Output:

Path found: ['A', 'F', 'G', 'I', 'J']
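NetworkX can also report the weighted cost of this path; a small follow-up sketch (its result is not part of the recorded output above):

# Total weighted length of the A* path (uses the 'weight' attribute set above)
length = nx.astar_path_length(G, 'A', 'J', heuristic=heuristic_func, weight='weight')
print('Path cost:', length)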

Conclusion:


Practical 5
Aim:
To implement the Hill Climbing algorithm for solving a scheduling problem
where classes, professors, and rooms must be allocated based on specific
constraints.

Theory:
The Hill Climbing algorithm is a local search optimization technique that iteratively improves
a single solution by moving to a neighboring state that has a better evaluation score. It is well-
suited for large search spaces where exploring all possible solutions (like with A*) would be
too costly. Hill Climbing is widely used in AI for optimization tasks, such as scheduling and
resource allocation.
In this session, we will implement Hill Climbing to schedule classes at a college, where we
need to assign classes to time slots and rooms, while taking professor preferences into account.
Unlike exhaustive search algorithms (e.g., A*), Hill Climbing seeks to find a "good enough"
solution efficiently, though it may not always find the global optimal solution.

Code:
# Class Timetable Problem using Hill Climbing
import copy
import random

# Initial timetable setup
initial_timetable = {
    'classes': ['AI', 'DSP', 'ML', 'WC', 'DIVP', 'IOT'],
    'professors': ['Prof. A', 'Prof. B', 'Prof. C', 'Prof. D',
                   'Prof. E', 'Prof. F'],
    'rooms': ['R1', 'R2', 'R3'],
    'timeslots': ['9:00', '10:00', '11:00', '12:00', '2:00', '3:00']
}

# Example constraints: Professors' preferences
professor_preferences = {
    'Prof. A': 'AI',
    'Prof. B': 'DSP',
    'Prof. C': 'ML',
    'Prof. D': 'WC',
    'Prof. E': 'DIVP',
    'Prof. F': 'IOT'
}

# Example dissatisfaction calculation function
def calculate_dissatisfaction(timetable):
    dissatisfaction = 0
    for i, cls in enumerate(timetable['classes']):
        professor = timetable['professors'][i]
        if professor_preferences[professor] != cls:
            dissatisfaction += 1  # Higher value means more dissatisfaction
    return dissatisfaction

# Hill Climbing Algorithm
def hill_climbing(initial_timetable):
    current_timetable = copy.deepcopy(initial_timetable)
    current_dissatisfaction = calculate_dissatisfaction(current_timetable)

    while True:
        neighbors = []

        # Generate neighbors by swapping classes
        for i in range(len(current_timetable['classes'])):
            for j in range(i + 1, len(current_timetable['classes'])):
                # Deep copy so each neighbour has its own class list
                neighbor = copy.deepcopy(current_timetable)
                neighbor['classes'][i], neighbor['classes'][j] = neighbor['classes'][j], neighbor['classes'][i]
                neighbors.append(neighbor)

        next_timetable = min(neighbors, key=calculate_dissatisfaction)
        next_dissatisfaction = calculate_dissatisfaction(next_timetable)

        if next_dissatisfaction >= current_dissatisfaction:
            break

        current_timetable = next_timetable
        current_dissatisfaction = next_dissatisfaction

    return current_timetable

# Displaying the results
def print_timetable(timetable):
    print("Class Schedule:")
    for i in range(len(timetable['classes'])):
        room = timetable['rooms'][i % len(timetable['rooms'])]  # Cycle through the available rooms
        print(f"{timetable['classes'][i]} at {timetable['timeslots'][i]} in {room}")

optimal_timetable = hill_climbing(initial_timetable)
print("Optimal Timetable using Hill Climbing:")
print_timetable(optimal_timetable)

Output:

Optimal Timetable using Hill Climbing:
Class Schedule:
ML at 2:00 in R3
AI at 3:00 in R1
DSP at 9:00 in R2
DIVP at 11:00 in R3
WC at 12:00 in R3
IOT at 10:00 in R2

Conclusion:


Practical 6
Aim:
To implement a Genetic Algorithm (GA) for optimizing a dietary plan,
showcasing how GA overcomes the limitations of Hill Climbing in terms of
finding global optima.

Theory:
Hill Climbing is a local search algorithm that iteratively moves towards the best neighboring
solution. However, it often gets stuck in local optima and lacks the ability to explore a wider
solution space. This limitation makes it inefficient for problems with complex landscapes.
Genetic Algorithms (GA), inspired by natural selection, offer a more robust alternative. Unlike
Hill Climbing, GA works with a population of solutions and applies operations such as
selection, crossover, and mutation. These operations allow GAs to explore a broader solution
space, making them more likely to find the global optimum.
In this practical, we will use a Genetic Algorithm to find the optimal vegetarian diet plan that
meets nutritional requirements while minimizing the cost.

Code:
import pygad

# Food items with cost and nutrition (per unit)
food_items = [
    {"name": "Apple", "cost": 50, "calories": 52, "protein": 0.3, "fat": 0.2},
    {"name": "Banana", "cost": 30, "calories": 96, "protein": 1.3, "fat": 0.3},
    {"name": "Rice", "cost": 40, "calories": 130, "protein": 2.7, "fat": 0.3},
    {"name": "Almonds", "cost": 90, "calories": 575, "protein": 21, "fat": 49},
    {"name": "Oats", "cost": 80, "calories": 389, "protein": 17, "fat": 7},
]

# Daily nutritional requirements (e.g., 2000 kcal, 50 g protein, <70 g fat)
daily_requirements = {"calories": 2000, "protein": 50, "fat": 70}

# Fitness function to evaluate solutions
def fitness_func(solution, solution_idx):
    total_cost = sum(food_items[i]['cost'] * solution[i] for i in range(len(food_items)))
    total_calories = sum(food_items[i]['calories'] * solution[i] for i in range(len(food_items)))
    total_protein = sum(food_items[i]['protein'] * solution[i] for i in range(len(food_items)))
    total_fat = sum(food_items[i]['fat'] * solution[i] for i in range(len(food_items)))

    calorie_penalty = abs(daily_requirements['calories'] - total_calories)
    protein_penalty = max(0, daily_requirements['protein'] - total_protein)
    fat_penalty = max(0, total_fat - daily_requirements['fat'])

    penalties = calorie_penalty + protein_penalty + fat_penalty

    fitness = -(total_cost + penalties)  # Negative because PyGAD maximizes fitness
    return fitness

# Genetic Algorithm
ga_instance = pygad.GA(
    num_generations=20,
    num_parents_mating=4,
    fitness_func=fitness_func,
    sol_per_pop=10,
    num_genes=len(food_items),
    gene_type=int,
    gene_space=range(0, 6),  # Each food item can be selected 0-5 times
    parent_selection_type="tournament",
    crossover_type="single_point",
    mutation_type="random",
    mutation_percent_genes=20,
)

ga_instance.run()

# Displaying the optimal solution
solution, solution_fitness, solution_idx = ga_instance.best_solution()
print("Optimal Diet Plan:")
for i, quantity in enumerate(solution):
    if quantity > 0:
        print(f"{quantity} x {food_items[i]['name']}")
total_cost = sum(food_items[i]['cost'] * solution[i] for i in range(len(food_items)))
print(f"Total Cost: ₹{total_cost:.2f}")

Output:

Optimal Diet Plan:
4 x Apple
2 x Banana
1 x Rice
4 x Almonds
4 x Oats
Total Cost: ₹915.00

Conclusion:


Practical 7
Aim:
To implement a Natural Language Processing (NLP) task using Python's
NLTK library to filter out stop words from a given sentence.

Theory:
Natural Language Processing (NLP) involves analyzing and processing human language data.
One common preprocessing task in NLP is stop word removal. Stop words are commonly used
words (such as "the", "is", "in") that usually don't contribute much to the meaning of a sentence
and can be removed to focus on more meaningful content.
In this practical, we will use the NLTK (Natural Language Toolkit) library to:
Tokenize the input sentence into words.
Remove stop words from the tokenized sentence using NLTK's pre-defined stop word list.
This technique is particularly useful in text analysis, sentiment analysis, and other NLP tasks
where reducing noise from the data improves processing efficiency and model accuracy.

Code:
# Importing necessary NLTK libraries
import nltk
nltk.download('punkt')
nltk.download('stopwords')

from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Input Sentence
sentence = """This is a sample sentence, showing off the stop words filtration."""

# Add StopWords to the variable from NLTK library
stop_words = set(stopwords.words('english'))
word_tokens = word_tokenize(sentence)

# Filter out stop words (case-insensitive comparison)
filtered_sentence = [w for w in word_tokens if not w.lower() in stop_words]

# Alternative method for stop word removal
# (case-sensitive; this overwrites the result of the list comprehension above)
filtered_sentence = []
for w in word_tokens:
    if w not in stop_words:
        filtered_sentence.append(w)

# Output the tokenized and filtered words
print("Original Sentence")
print(word_tokens)
print("Filtered Sentence")
print(filtered_sentence)

# Output as a sentence
print("\nOriginal Sentence")
print(" ".join(word_tokens))
print("Filtered Sentence")
print(" ".join(filtered_sentence))

Output:
Original Sentence
['This', 'is', 'a', 'sample', 'sentence', ',', 'showing', 'off',
'the', 'stop', 'words', 'filtration', '.']
Filtered Sentence
['This', 'sample', 'sentence', ',', 'showing', 'stop', 'words',
'filtration', '.']

Original Sentence
This is a sample sentence , showing off the stop words filtration .
Filtered Sentence
This sample sentence , showing stop words filtration .
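Punctuation tokens such as "," and "." survive stop-word filtering; a minimal follow-up sketch (not part of the recorded output above) drops them as well:

import string

# Remove punctuation tokens left over after stop-word filtering
filtered_no_punct = [w for w in filtered_sentence if w not in string.punctuation]
print(" ".join(filtered_no_punct))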

Conclusion:


Practical 8
Aim:
To implement word stemming using Python's NLTK library, specifically
using the Porter Stemmer to reduce words to their root forms.
Theory:
Stemming is a common Natural Language Processing (NLP) task that reduces words to their
base or root form. The goal of stemming is to simplify the text for easier analysis by grouping
similar words with the same root (e.g., "running" becomes "run"). Stemming is useful in search
engines, text analysis, and machine learning tasks where variations of the same word need to
be treated as the same.
In this practical, we will use the Porter Stemmer, which is one of the most widely used
stemming algorithms. It operates by removing common morphological and inflexional endings
from words in English.

Code:
# Importing necessary libraries for stemming
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

# Initialize the Porter Stemmer
ps = PorterStemmer()

# Input sentence
sentence = "The coming era of Artificial Intelligence will not be the era of war, but be the era of deep compassion, non-violence, and love."

# Tokenize the sentence into words
words = word_tokenize(sentence)

# Apply stemming to each word and print the results
for w in words:
    print(w, " : ", ps.stem(w))

Output:
The : the
coming : come
era : era
of : of
Artificial : artifici
Intelligence : intellig
will : will
not : not
be : be
the : the
era : era
of : of
war : war
, : ,
but : but
be : be
the : the
era : era
of : of
deep : deep
compassion : compass
non-violence : non-violenc
, : ,
and : and
love : love
. : .
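A quick illustration of how the Porter Stemmer groups inflected forms of the same root (a sketch; its output is not part of the recorded results above):

# Stemming several variants of the same root word
for w in ["connect", "connected", "connecting", "connection"]:
    print(w, " : ", ps.stem(w))
# All four typically reduce to the same stem: "connect"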

Conclusion:


Practical 9
Aim:
To implement lemmatization using Python's NLTK library to convert words
into their base or dictionary form (lemma).

Theory:
Lemmatization is the process of reducing words to their base or dictionary form (lemma) based
on their meaning and grammatical context. Unlike stemming, which crudely removes suffixes
to shorten words, lemmatization considers the word's part of speech and converts it to its proper
form as found in a dictionary. This makes lemmatization more accurate than stemming when
it comes to preserving the meaning of words.
• Stemming often removes word suffixes to generate a root form, but this root form may
not be a valid word (e.g., "running" becomes "run" or "runn"). Stemming focuses purely
on removing affixes without considering the word's meaning.
• Lemmatization, on the other hand, uses vocabulary and morphological analysis to
return the proper root form of a word based on its context. For example, "better" is
lemmatized to "good", while stemming may not capture such semantic information.
Lemmatization is often preferred over stemming in applications where the meaning of words
needs to be retained, such as in text classification or machine translation.

Code:
# Importing necessary libraries for lemmatization
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize
import nltk
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('omw-1.4')

# Initialize the WordNet Lemmatizer
lemmatizer = WordNetLemmatizer()

# Input sentence
sentence = "The cats are playing with the balls in the garden."

# Tokenize the sentence into words
words = word_tokenize(sentence)

# Apply lemmatization to each word and print the results
print("Original Word : Lemmatized Word")
for w in words:
    print(w, " : ", lemmatizer.lemmatize(w))

Output:

Original Word : Lemmatized Word
The : The
cats : cat
are : are
playing : playing
with : with
the : the
balls : ball
in : in
the : the
garden : garden
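Words like "are" and "playing" are unchanged above because lemmatize() treats each word as a noun by default; a small sketch passing a part-of-speech hint (not part of the recorded output):

# Passing pos='v' tells the lemmatizer to treat the word as a verb
print(lemmatizer.lemmatize("playing", pos='v'))  # -> play
print(lemmatizer.lemmatize("are", pos='v'))      # -> be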

Conclusion:


Practical 10
Aim:
To implement the Bag of Words model to represent text data as a matrix of
word frequencies using Python's sklearn library.

Theory:
The Bag of Words (BoW) model is a popular and simple technique used in Natural Language
Processing (NLP) for converting text into numerical representations. BoW creates a matrix
where each row represents a sentence or document, and each column represents a unique word
in the corpus. The value in each cell corresponds to the frequency of the word in the
sentence/document. This model ignores grammar and word order, focusing solely on word
frequency, which is useful for machine learning algorithms that work with numerical data.
BoW is often used in text classification, sentiment analysis, and other NLP tasks.

Code:
# Importing necessary libraries
from sklearn.feature_extraction.text import CountVectorizer

# Sample text data
documents = [
    "Artificial intelligence is the future of technology.",
    "Machine learning is a subset of artificial intelligence.",
    "Deep learning advances the field of machine learning.",
]

# Initialize the CountVectorizer
vectorizer = CountVectorizer()

# Fit the model and transform the documents into a word frequency matrix
X = vectorizer.fit_transform(documents)

# Display the Bag of Words matrix
print("Bag of Words Matrix:")
print(X.toarray())

# Display the feature names (words)
print("\nFeature Names (Words):")
print(vectorizer.get_feature_names_out())

Output:

Bag of Words Matrix:
[[1 0 0 0 1 1 1 0 0 1 1 1]
 [1 1 0 0 1 1 1 0 0 1 1 1]
 [0 0 1 1 0 1 0 1 1 0 0 0]]

Feature Names (Words):
['advances' 'artificial' 'deep' 'field' 'future' 'intelligence' 'is'
 'learning' 'machine' 'of' 'subset' 'technology']
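For readability, the matrix can be paired with its column names; a minimal sketch using pandas (illustrative, output not recorded here):

import pandas as pd

# Label each column of the BoW matrix with its corresponding word
bow_df = pd.DataFrame(X.toarray(), columns=vectorizer.get_feature_names_out())
print(bow_df)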

Conclusion:


Practical 11
Aim:
Demonstration of Image classification using Deep Learning.

Theory:
Humans can look at many images every second and process them without realising how the processing is done, but the same is not true of machines. The first step in image processing is therefore to understand how to represent an image so that a machine can read it.
Every image is an arrangement of dots (pixels) in a specific order. If you change the order or colour of a pixel, the image changes as well.

A basic convolutional neural network is built from three basic components: the convolutional layer, the pooling layer, and the output layer.

The Convolutional Layer:

Suppose we have an image of size 6×6. We define a weight matrix that extracts certain features from the image. Here the weight is initialised as a 3×3 matrix. This weight slides across the image so that all the pixels are covered at least once, producing a convolved output. A value such as 429 in the output is obtained by element-wise multiplication of the weight matrix with the highlighted 3×3 part of the input image and adding up the results.


The 6×6 image is thus converted into a 4×4 image. Think of the weight matrix like a paint brush painting a wall: the brush first paints the wall horizontally and then comes down and paints the next row. Pixel values are used again when the weight matrix moves along the image, which enables parameter sharing in a convolutional neural network. Let's see what this looks like on a real image.

● The weight matrix behaves like a filter, extracting particular information from the original image matrix.
● One weight combination might extract edges, another might pick out a particular colour, and another might just blur out unwanted noise.
● The weights are learnt so that the loss function is minimised; they extract the features from the original image that help the network make correct predictions.
● When we use multiple convolutional layers, the initial layers extract more generic features, and as the network gets deeper the features become more complex.

Let us understand some concepts here before we go deeper.

What is Stride?

As shown above, the filter (weight matrix) moves across the entire image one pixel at a time. The number of pixels the weight matrix shifts at each step is a hyperparameter called the stride; moving one pixel at a time is a stride of 1. Let us see how it looks for a stride of 2.

As you can see, the size of the output keeps reducing as we increase the stride value.


Padding the input image with zeros around its border solves this problem for us. We can also add more than one layer of zeros around the image in the case of higher stride values. After padding the image with zeros, the initial shape of the image is retained; this is known as same padding, since the output image has the same size as the input. (If no zeros are added and only the valid pixels of the input image are used, it is called valid padding.) With same padding the middle 4×4 pixels stay the same, we retain more information from the borders, and the size of the image is also preserved.

Having Multiple Filters & the Activation Map

● The depth dimension of the weight matrix is the same as the depth dimension of the input image.
● The weight extends to the entire depth of the input image.
● Convolution with a single weight matrix results in a convolved output with a single depth dimension. When multiple filters are used, they all have the same dimensions and are applied together.
● The output from each filter is stacked, forming the depth dimension of the convolved image.
● Suppose we have an input image of size 32×32×3 and we apply 10 filters of size 5×5×3 with valid padding; the output will have dimensions 28×28×10.

This stacked activation map is the output of the convolution layer.

The Pooling Layer

If images are big, we need to reduce the number of trainable parameters. For this we use pooling layers between convolution layers. Pooling reduces the spatial size of the image and is applied independently on each depth dimension, so the image depth does not change. Max pooling is the most popular form of pooling layer.


Here we have taken the stride as 2 and the pooling size also as 2. The max operation is applied to each depth dimension of the convolved output. As you can see, the 4×4 convolved output becomes 2×2 after the max pooling operation. Let's see how max pooling looks on a real image.

In the image above, a convolved image had max pooling applied to it; the result still retains the image information (it is a car), but if we observe closely the dimensions of the image are reduced to half, which means the number of parameters can be reduced considerably.

There are other forms of pooling like average pooling, L2 norm pooling.

Output dimensions

1. It can be tricky to work out the input and output dimensions at the end of each convolution layer. Three hyperparameters control the size of the output volume.
2. Number of filters – the depth of the output volume (the activation map) is equal to the number of filters applied.
3. Stride – with a stride of one we move across and down a single pixel at a time. With higher stride values we move several pixels at a time and hence produce smaller output volumes.
4. Zero padding – this helps us preserve the size of the input image. If a single layer of zero padding is added, a single-stride filter movement retains the size of the original image.

We can apply a simple formula to calculate the output dimensions. The spatial size of the output image can be calculated as ((W - F + 2P) / S) + 1, where W is the input volume size, F is the size of the filter, P is the amount of zero padding applied, and S is the stride.

Let us take an example of an input image of size 64×64×3, to which we apply 10 filters of size 3×3×3, with a single stride and no zero padding.


Here W = 64, F = 3, P = 0 and S = 1. The output depth will be equal to the number of filters applied, i.e. 10.

The spatial size of the output will be ((64 - 3 + 0) / 1) + 1 = 62. Therefore the output volume will be 62×62×10.
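The same calculation expressed as a small helper function (a sketch; the function name and calls are illustrative):

def conv_output_size(W, F, P=0, S=1):
    # Spatial output size of a convolution: ((W - F + 2P) / S) + 1
    return (W - F + 2 * P) // S + 1

print(conv_output_size(64, 3))   # 62, matching the worked example above
print(conv_output_size(32, 5))   # 28, as in the 32x32x3 example with 5x5 filters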

The Output layer

● After the layers of convolution and padding, we need the output in the form of a class.
● To generate the final output we apply a fully connected layer that produces an output equal to the number of classes we need.
● Convolution layers generate 3D activation maps, whereas we just need an output that says whether or not an image belongs to a particular class.
● The output layer has a loss function, such as categorical cross-entropy, to compute the error in prediction. Once the forward pass is complete, backpropagation begins to update the weights and biases for error and loss reduction.

Summary:

● Pass an input image to the first convolutional layer. The convolved output is obtained as an activation map. The filters applied in the convolution layer extract relevant features from the input image to pass further.
● Each filter gives a different feature to aid correct class prediction. If we need to retain the size of the image, we use same padding (zero padding); otherwise valid padding is used, since it helps reduce the number of features.
● Pooling layers are then added to further reduce the number of parameters.
● Several convolution and pooling layers are added before the prediction is made. Convolutional layers help in extracting features; as we go deeper in the network, more specific features are extracted, compared with a shallow network where the extracted features are more generic.
● The output layer in a CNN, as mentioned previously, is a fully connected layer, where the input from the other layers is flattened and transformed into the number of classes desired by the network.
● The output generated by the output layer is compared with the target output to compute the error. A loss function is defined in the fully connected output layer to compute this loss, and the gradient of the error is then calculated.
● The error is then backpropagated to update the filter weights and bias values.
● One training cycle is completed in a single forward and backward pass.

Demonstration:


Import necessary Library:
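The original notebook screenshots for this demonstration are not reproduced here. The cells below are a minimal illustrative sketch, assuming TensorFlow/Keras and the MNIST handwritten-digit dataset (the dataset and model actually used in the original screenshots are not shown, so every name here is an assumption); the later cells continue this same sketch.

# Illustrative sketch: imports and data loading (Colab ships with TensorFlow pre-installed)
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load the MNIST dataset: 28x28 grayscale digit images, 10 classes
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Scale pixel values to [0, 1] and add a channel dimension for the Conv2D layers
x_train = x_train.astype("float32")[..., np.newaxis] / 255.0
x_test = x_test.astype("float32")[..., np.newaxis] / 255.0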


Model Architecture:
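Continuing the sketch above, a small CNN with two convolution/pooling stages, a flatten step, and dense layers, mirroring the layer types described in the theory:

# Convolution -> pooling -> convolution -> pooling -> flatten -> dense output
model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # one output per digit class
])
model.summary()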

Train the Model:
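Compiling and training the sketched model (the optimiser, epoch count and batch size are illustrative choices):

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",  # integer class labels
              metrics=["accuracy"])

history = model.fit(x_train, y_train,
                    epochs=5,
                    batch_size=64,
                    validation_split=0.1)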


Evaluate the Model:
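Evaluating the sketched model on the held-out test set:

test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print("Test accuracy:", test_acc)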


Predict the Label:
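Predicting the label of a single test image with the sketched model; the network outputs one probability per class, and argmax picks the predicted label:

probs = model.predict(x_test[:1])
print("Predicted label:", np.argmax(probs, axis=1)[0])
print("True label:", y_test[0])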

Conclusion:
