
INDEX

SR. NO.  PRACTICAL NAME (DATE OF PERFORM)

1. Document Indexing and Retrieval (11-12-2024)
    Implement an inverted index construction algorithm.
    Build a simple document retrieval system using the constructed index.
2. Retrieval Models (11-12-2024)
    Implement the Boolean retrieval model and process queries.
    Implement the vector space model with TF-IDF weighting and cosine similarity.
3. Spelling Correction in IR Systems (08-01-2025)
    Develop a spelling correction module using edit distance algorithms.
    Integrate the spelling correction module into an information retrieval system.
4. Evaluation Metrics for IR Systems (15-01-2025)
    Calculate precision, recall, and F-measure for a given set of retrieval results.
    Use an evaluation toolkit to measure average precision and other evaluation metrics.
5. Text Categorization (22-01-2025)
    Implement a text classification algorithm (e.g., Naive Bayes or Support Vector Machines).
    Train the classifier on a labelled dataset and evaluate its performance.
6. Clustering for Information Retrieval (05-02-2025)
    Implement a clustering algorithm (e.g., K-means or hierarchical clustering).
    Apply the clustering algorithm to a set of documents and evaluate the clustering results.
7. Web Crawling and Indexing (05-02-2025)
    Develop a web crawler to fetch and index web pages.
    Handle challenges such as robots.txt, dynamic content, and crawling delays.
8. Link Analysis and PageRank (12-02-2025)
    Implement the PageRank algorithm to rank web pages based on link analysis.
    Apply the PageRank algorithm to a small web graph and analyze the results.
9. Learning to Rank (19-02-2025)
    Implement a learning to rank algorithm (e.g., RankSVM or RankBoost).
    Train the ranking model using labelled data and evaluate its effectiveness.
10. Advanced Topics in Information Retrieval (26-02-2025)
    Implement a text summarization algorithm (e.g., extractive or abstractive).
    Build a question-answering system using techniques such as information extraction.

Practical No. 1
Aim: Document Indexing and Retrieval
 Implement an inverted index construction algorithm.
 Build a simple document retrieval system using the constructed index.

Program:
# Define the documents
document1 = "The quick brown fox jumped over the lazy dog."
document2 = "The lazy dog slept in the sun."

# Step 1: Tokenize the documents
# Convert each document to lowercase and split it into words
# (note: this simple whitespace split keeps punctuation, so "dog." and "dog"
# are treated as distinct terms)
tokens1 = document1.lower().split()
tokens2 = document2.lower().split()

# Combine the tokens into a list of unique terms
terms = list(set(tokens1 + tokens2))

# Step 2: Build the inverted index
# Create an empty dictionary to store the inverted index
inverted_index = {}

# For each term, record the documents that contain it
for term in terms:
    documents = []
    if term in tokens1:
        documents.append("Document 1")
    if term in tokens2:
        documents.append("Document 2")
    inverted_index[term] = documents

# Step 3: Print the inverted index
for term, documents in inverted_index.items():
    print(term, "->", ", ".join(documents))
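The aim's second bullet asks for a simple retrieval system on top of this index. A minimal lookup sketch (an addition, not part of the original program), assuming the inverted_index built above:

def retrieve(query):
    """AND-style retrieval: return the documents containing every query term."""
    result = None
    for term in query.lower().split():
        docs = set(inverted_index.get(term, []))
        result = docs if result is None else result & docs
    return sorted(result) if result else []

print(retrieve("lazy"))       # ['Document 1', 'Document 2']
print(retrieve("quick fox"))  # ['Document 1']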

Output:

Practical No. 2
Aim: Retrieval Models
 Implement the Boolean retrieval model and process queries.
 Implement the vector space model with TF-IDF weighting and cosine
similarity.

Program:
import pandas
from contextlib import redirect_stdout

terms = []
keys = []
vec_Dic = {}
dicti = {}
dummy_List = []  # scratch list reused for building rows, cleared after each use

def filter(documents, rows, cols):
    for i in range(rows):
        for j in range(cols):
            if j == 0:
                # first column holds the name of the document in the csv file
                keys.append(documents.loc[i].iat[j])
            else:
                dummy_List.append(documents.loc[i].iat[j])
                # add the term to the term list if it is not already present
                if documents.loc[i].iat[j] not in terms:
                    terms.append(documents.loc[i].iat[j])
        # map the document name to its list of terms
        dicti.update({documents.loc[i].iat[0]: dummy_List.copy()})
        dummy_List.clear()

def bool_Representation(dicti, rows, cols):
    terms.sort()
    for i in dicti:
        for j in terms:
            # append 1 if the term occurs in the document, else 0
            if j in dicti[i]:
                dummy_List.append(1)
            else:
                dummy_List.append(0)
        # store the boolean vector for this document
        vec_Dic.update({i: dummy_List.copy()})
        dummy_List.clear()

def query_Vector(query):
    '''Represent the query as a boolean vector over the sorted term list'''
    qvect = []
    for i in terms:
        if i in query:
            qvect.append(1)
        else:
            qvect.append(0)
    return qvect

def prediction(q_Vect):
    '''Rank the documents for the given query by counting matching
    positions between the query vector and each document vector'''
    dictionary = {}
    listi = []
    count = 0
    term_Len = len(terms)  # number of terms in the term list
    for i in vec_Dic:
        for t in range(term_Len):
            if q_Vect[t] == vec_Dic[i][t]:
                count += 1
        dictionary.update({i: count})
        count = 0  # reset the count for the next document
    for i in dictionary:
        listi.append(dictionary[i])
    listi = sorted(listi, reverse=True)
    ans = ''
    with open('output.txt', 'w') as f:
        with redirect_stdout(f):
            print("ranking of the documents")
            for count, i in enumerate(listi):
                key = check(dictionary, i)  # get the key for a known value
                if count == 0:
                    ans = key  # the most relevant document
                print(key, "rank is", count + 1)
                dictionary.pop(key)
            print(ans, "is the most relevant document for the given query")

def check(dictionary, val):
    '''Return the key whose value equals val'''
    for key, value in dictionary.items():
        if val == value:
            return key

def main():
    # read the data from the csv file as a dataframe
    documents = pandas.read_csv(r'D:\deore\documents.csv')
    rows = len(documents)
    cols = len(documents.columns)
    filter(documents, rows, cols)
    bool_Representation(dicti, rows, cols)
    print("Enter query")
    query = input().split(' ')  # split the query into a list of strings
    q_Vect = query_Vector(query)  # boolean vector for the query
    prediction(q_Vect)

main()
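The program above implements only the Boolean model. For the aim's second bullet, a minimal vector space sketch with TF-IDF weighting and cosine similarity, using scikit-learn (an assumed library choice; the corpus and query below are illustrative):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "the quick brown fox jumped over the lazy dog",
    "the lazy dog slept in the sun",
    "information retrieval ranks documents by relevance",
]
query = "lazy dog"

# TF-IDF weighting for documents and query in one shared vocabulary
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)
query_vector = vectorizer.transform([query])

# Rank documents by cosine similarity to the query
scores = cosine_similarity(query_vector, doc_vectors).ravel()
for rank, idx in enumerate(scores.argsort()[::-1], start=1):
    print(f"rank {rank}: document {idx + 1} (score {scores[idx]:.3f})")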

Output:

Practical No. 3
Aim: Spelling Correction in IR Systems
 Develop a spelling correction module using edit distance algorithms.
 Integrate the spelling correction module into an information retrieval
system.
Program:
# Importing necessary libraries
import nltk
from nltk.metrics.distance import edit_distance
from nltk.corpus import words

# Download the 'words' corpus
nltk.download('words')

# List of correct words from the 'words' corpus
correct_words = words.words()

# List of incorrect spellings that need to be corrected
incorrect_words = ['happpy', 'azmaing', 'intelliengt', 'natuer', 'ashy']

# Print the incorrect words
print("Incorrect Words:", incorrect_words)
print("========= Result =========")

# Find the closest correct spelling for each word by edit distance
for word in incorrect_words:
    # Calculate the edit distance between the word and every correct word
    temp = [(edit_distance(word, w), w) for w in correct_words]
    # Print the correct word with the minimum edit distance
    print(f"Incorrect word: {word} => Corrected word: {sorted(temp, key=lambda val: val[0])[0][1]}")
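For the aim's second bullet, one way to integrate the module into an IR system is to correct each query term against the index vocabulary before lookup. A sketch (an addition, reusing edit_distance from above with a hypothetical toy index in the style of Practical 1):

def correct_term(term, vocabulary):
    """Return the vocabulary word with the smallest edit distance to term."""
    return min(vocabulary, key=lambda w: edit_distance(term, w))

def corrected_search(query, inverted_index):
    """Correct each query term against the index vocabulary, then look it up."""
    vocabulary = list(inverted_index.keys())
    results = {}
    for term in query.lower().split():
        corrected = term if term in vocabulary else correct_term(term, vocabulary)
        results[corrected] = inverted_index.get(corrected, [])
    return results

# Example with a toy index:
index = {"lazy": ["Doc1", "Doc2"], "dog": ["Doc2"], "quick": ["Doc1"]}
print(corrected_search("lazzy dgo", index))  # terms are corrected before lookup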

Output:

Practical No. 4
Aim: Evaluation Metrics for IR Systems
 Calculate precision, recall, and F-measure for a given set of retrieval
results.
 Use an evaluation toolkit to measure average precision and other
evaluation metrics.
Program 1:
from sklearn.metrics import precision_score

# define actual
act_pos = [1 for _ in range(100)]
act_neg = [0 for _ in range(10000)]
y_true = act_pos + act_neg

# define predictions
pred_pos = [0 for _ in range(10)] + [1 for _ in range(90)]  # 90 positive, 10 negative predictions
pred_neg = [1 for _ in range(30)] + [0 for _ in range(9970)]  # 30 positive, 9970 negative predictions
y_pred = pred_pos + pred_neg

# calculate precision
precision = precision_score(y_true, y_pred, average='binary')
print('Precision: %.3f' % precision)

Output:

Program 2: Recall
from sklearn.metrics import recall_score

# define actual
act_pos = [1 for _ in range(100)]
act_neg = [0 for _ in range(10000)]
y_true = act_pos + act_neg

# define predictions
pred_pos = [0 for _ in range(10)] + [1 for _ in range(90)]
pred_neg = [0 for _ in range(10000)]
y_pred = pred_pos + pred_neg

# calculate recall
recall = recall_score(y_true, y_pred, average='binary')
print('Recall: %.3f' % recall)

Program 3: F-measure
from sklearn.metrics import f1_score

# define actual labels
act_pos = [1 for _ in range(100)]  # 100 positive instances
act_neg = [0 for _ in range(10000)]  # 10000 negative instances
y_true = act_pos + act_neg  # combine actual positive and negative labels

# define predictions
pred_pos = [0 for _ in range(5)] + [1 for _ in range(95)]  # 95 positive, 5 negative predictions
pred_neg = [1 for _ in range(55)] + [0 for _ in range(9945)]  # 55 positive, 9945 negative predictions
y_pred = pred_pos + pred_neg  # combine predicted positive and negative labels

# calculate F1 score
score = f1_score(y_true, y_pred, average='binary')

# print the F1 score (F-measure)
print('F-Measure: %.3f' % score)
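For the aim's second bullet, scikit-learn can serve as the evaluation toolkit. A sketch computing average precision from ranked retrieval scores (the relevance judgements and scores below are illustrative):

from sklearn.metrics import average_precision_score, precision_recall_curve

# Relevance judgements (1 = relevant) and retrieval scores for 8 documents
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

# Average precision summarises the precision-recall curve in one number
ap = average_precision_score(y_true, y_scores)
print('Average Precision: %.3f' % ap)

# Precision and recall at each score threshold
precision, recall, thresholds = precision_recall_curve(y_true, y_scores)
for p, r in zip(precision, recall):
    print('precision=%.3f recall=%.3f' % (p, r))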

Output:

Practical No. 5
Aim: Text Categorization
 Implement a text classification algorithm (e.g., Naive Bayes or Support
Vector Machines).
 Train the classifier on a labelled dataset and evaluate its performance.
Program:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report

# Sample dataset
texts = [
    "I love this movie, it's amazing!",
    "What a great day, I feel so happy!",
    "This is the worst product I've ever bought.",
    "I'm so sad and disappointed.",
    "Absolutely fantastic experience, will buy again!",
    "Terrible customer service, very unhappy.",
    "I'm excited about this new opportunity!",
    "The food was awful and I got sick."
]

# Corresponding labels (1 for positive, 0 for negative)
labels = [1, 1, 0, 0, 1, 0, 1, 0]

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42
)

# Convert text data into feature vectors
vectorizer = CountVectorizer()
X_train_vectors = vectorizer.fit_transform(X_train)
X_test_vectors = vectorizer.transform(X_test)

# Train a Naive Bayes classifier
classifier = MultinomialNB()
classifier.fit(X_train_vectors, y_train)

# Make predictions
y_pred = classifier.predict(X_test_vectors)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)

print(f"Accuracy: {accuracy:.2f}")
print("Classification Report:")
print(report)

Output:

Practical No. 6
Aim: Clustering for Information Retrieval
 Implement a clustering algorithm (e.g., K-means or hierarchical
clustering).
 Apply the clustering algorithm to a set of documents and evaluate the
clustering results.
Program:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cluster import KMeans

dataset = pd.read_csv("D:/Amruta/Mall_Customers.csv")
print(dataset)

# Use the third and fourth columns (annual income, spending score) as features
x = dataset.iloc[:, [3, 4]].values
print(x)

# Compute WCSS for k = 1..10 to pick the number of clusters (elbow method)
wcss_list = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
    kmeans.fit(x)
    wcss_list.append(kmeans.inertia_)

plt.plot(range(1, 11), wcss_list)
plt.title("The Elbow Method Graph")
plt.xlabel("Number of clusters (k)")
plt.ylabel("WCSS (Within-Cluster Sum of Squares)")
plt.show()
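The program above clusters customer data; to apply K-means to documents and evaluate the result (the aim's second bullet), a sketch using TF-IDF features and the silhouette score (the corpus below is illustrative):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

documents = [
    "the stock market fell sharply today",
    "investors worry about rising interest rates",
    "the football team won the championship",
    "the striker scored two goals in the final",
]

# Represent documents as TF-IDF vectors
X = TfidfVectorizer().fit_transform(documents)

# Cluster the documents into two groups
kmeans = KMeans(n_clusters=2, init='k-means++', random_state=42)
labels = kmeans.fit_predict(X)

# Silhouette score measures cluster cohesion/separation (closer to 1 is better)
print("Cluster labels:", labels)
print("Silhouette score: %.3f" % silhouette_score(X, labels))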

Output:

Practical No. 7
Aim: Web Crawling and Indexing
 Develop a web crawler to fetch and index web pages.
 Handle challenges such as robots.txt, dynamic content, and crawling
delays.
Program:
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

# Simple Web Crawler
class SimpleWebCrawler:
    def __init__(self, base_url, max_pages=10):
        self.base_url = base_url
        self.max_pages = max_pages
        self.visited = set()

    def crawl(self, url, depth=0):
        # Stop at already-visited pages or once the depth limit is reached
        if url in self.visited or depth >= self.max_pages:
            return
        try:
            response = requests.get(url)
            if response.status_code != 200:
                return

            print(f"Crawling: {url}")
            self.visited.add(url)

            soup = BeautifulSoup(response.text, 'html.parser')

            # Follow only links that stay within the base site
            for link in soup.find_all('a', href=True):
                next_url = urljoin(url, link['href'])
                if next_url.startswith(self.base_url):
                    self.crawl(next_url, depth + 1)

        except Exception as e:
            print(f"Failed to crawl {url}: {e}")

# Usage example
if __name__ == "__main__":
    start_url = "https://www.tpointtech.com/"
    crawler = SimpleWebCrawler(base_url=start_url, max_pages=5)
    crawler.crawl(start_url)
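The crawler above does not yet address the challenges named in the aim. A sketch of one way to respect robots.txt and add a crawl delay, using urllib.robotparser and time.sleep (handling JavaScript-rendered dynamic content would additionally need a browser-driven tool such as Selenium; polite_fetch is a hypothetical helper, not part of the original program):

import time
import requests
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

def polite_fetch(url, base_url, delay=1.0, user_agent="SimpleWebCrawler"):
    """Fetch url only if robots.txt allows it, pausing between requests."""
    rp = RobotFileParser()
    rp.set_url(urljoin(base_url, "/robots.txt"))
    rp.read()
    if not rp.can_fetch(user_agent, url):
        print(f"Blocked by robots.txt: {url}")
        return None
    time.sleep(delay)  # crawl delay so the server is not overloaded
    return requests.get(url, headers={"User-Agent": user_agent})

In SimpleWebCrawler, the parser would sensibly be created once in __init__ and reused for every request rather than re-read per fetch.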

Output:

Practical No. 8
Aim: Link Analysis and PageRank
 Implement the PageRank algorithm to rank web pages based on link
analysis.
 Apply the PageRank algorithm to a small web graph and analyze the
results.

Program:
import numpy as np

def pagerank(graph, alpha=0.85, tol=1.0e-6, max_iter=100):
    """Computes PageRank scores for a directed graph."""
    n = len(graph)
    if n == 0:
        return {}

    # Initialize ranks uniformly
    ranks = np.ones(n) / n

    # Construct the transition matrix M
    M = np.zeros((n, n))
    nodes = list(graph.keys())
    for i, node in enumerate(graph):
        links = graph[node]
        if links:
            M[i, [nodes.index(dest) for dest in links]] = 1 / len(links)
        else:
            M[i, :] = 1 / n  # distribute rank to all pages (dangling node)

    # Power iteration method
    for _ in range(max_iter):
        new_ranks = alpha * np.dot(M.T, ranks) + (1 - alpha) / n * np.ones(n)

        # Check for convergence (change in ranks below tolerance)
        if np.linalg.norm(new_ranks - ranks, ord=1) < tol:
            break

        ranks = new_ranks

    return {node: ranks[i] for i, node in enumerate(graph)}

if __name__ == "__main__":
    # Example web graph
    web_graph = {
        "A": ["B", "C"],
        "B": ["C", "D"],
        "C": ["A"],
        "D": ["C"]
    }

    # Compute PageRank
    ranks = pagerank(web_graph)

    # Print PageRank scores
    print("PageRank Scores:")
    for page, rank in ranks.items():
        print(f"{page}: {rank:.4f}")
Output:

Practical No. 9
Aim: Learning to Rank
 Implement a learning to rank algorithm (e.g., RankSVM or
RankBoost).
 Train the ranking model using labelled data and evaluate its
effectiveness.
Program:
# NOTE: gensim.summarization is available only in gensim versions before 4.0
from gensim.summarization import summarize

def extractive_summary(text, ratio=0.2):
    """
    Generate an extractive summary of the given text using Gensim's summarize function.

    Parameters:
    - text (str): The input text to summarize.
    - ratio (float): The ratio of sentences to include in the summary (default is 0.2).

    Returns:
    - str: The extractive summary.
    """
    try:
        summary = summarize(text, ratio=ratio)
        return summary
    except ValueError:
        return "Input text is too short to summarize."

# Example usage:
text = """
Artificial intelligence (AI) is intelligence demonstrated by machines, in contrast to the
natural intelligence displayed by humans. Leading AI textbooks define the field as the study
of "intelligent agents": any device that perceives its environment and takes actions to
maximize its chance of success at some goal. Colloquially, the term "artificial intelligence"
is applied when a machine mimics "cognitive" functions that humans associate with the human
mind, such as "learning" and "problem-solving". As machines become increasingly capable,
tasks considered to require "intelligence" are often removed from the definition of AI. A
quip in Tesler's Theorem says "AI is whatever hasn't been done yet." For instance, optical
character recognition is frequently excluded from things considered to be AI, having become
a routine technology. Modern machine capabilities generally classified as AI include
successfully understanding human speech, competing at the highest level in strategic game
systems (such as chess and Go), self-driving cars, and more.
"""

print("Extractive Summary:")
print(extractive_summary(text))
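The program above performs extractive summarization (it corresponds to the first bullet of Practical 10) rather than learning to rank. A minimal pairwise RankSVM-style sketch for the stated aim, using scikit-learn's LinearSVC on difference vectors; the feature values and relevance labels below are illustrative assumptions:

import numpy as np
from itertools import combinations
from sklearn.svm import LinearSVC

# Illustrative feature vectors for four documents of one query,
# with graded relevance labels (higher = more relevant)
X = np.array([[0.9, 0.2], [0.6, 0.5], [0.3, 0.8], [0.1, 0.1]])
y = np.array([3, 2, 1, 0])

# Pairwise transform: each pair with different labels yields a
# difference vector labelled by the sign of the label difference
X_pairs, y_pairs = [], []
for i, j in combinations(range(len(X)), 2):
    if y[i] != y[j]:
        X_pairs.append(X[i] - X[j])
        y_pairs.append(np.sign(y[i] - y[j]))
        X_pairs.append(X[j] - X[i])  # mirrored pair balances the two classes
        y_pairs.append(np.sign(y[j] - y[i]))

# A linear SVM on the differences learns a RankSVM-style weight vector
model = LinearSVC(C=1.0)
model.fit(np.array(X_pairs), np.array(y_pairs))

# Rank documents by their scores under the learned weights
scores = X @ model.coef_.ravel()
print("Ranking (best first):", np.argsort(-scores))

# Evaluate effectiveness: fraction of document pairs ordered correctly
pairs = list(combinations(range(len(X)), 2))
correct = sum((scores[i] > scores[j]) == (y[i] > y[j]) for i, j in pairs)
print("Pairwise accuracy: %.2f" % (correct / len(pairs)))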

Output:

Practical No. 10
Aim: Advanced Topics in Information Retrieval
 Implement a text summarization algorithm (e.g., extractive or
abstractive).
 Build a question-answering system using techniques such as
information extraction.
Program:
from transformers import pipeline

def abstractive_summary(text, min_length=30, max_length=130):
    """
    Generate an abstractive summary of the given text using Hugging Face's
    transformers pipeline.

    Parameters:
    - text (str): The input text to summarize.
    - min_length (int): Minimum length of the summary.
    - max_length (int): Maximum length of the summary.

    Returns:
    - str: The abstractive summary.
    """
    summarizer = pipeline("summarization")
    summary = summarizer(text, min_length=min_length, max_length=max_length)
    return summary[0]['summary_text']

# Example usage (the sample text from Practical 9):
text = """
Artificial intelligence (AI) is intelligence demonstrated by machines, in contrast to the
natural intelligence displayed by humans. Leading AI textbooks define the field as the study
of "intelligent agents": any device that perceives its environment and takes actions to
maximize its chance of success at some goal. Modern machine capabilities generally classified
as AI include successfully understanding human speech, competing at the highest level in
strategic game systems (such as chess and Go), self-driving cars, and more.
"""

print("Abstractive Summary:")
print(abstractive_summary(text))
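For the aim's second bullet, a minimal question-answering sketch using the transformers question-answering pipeline (the context passage is taken from the sample text above; the pipeline's default extractive QA model is assumed):

from transformers import pipeline

# The QA pipeline extracts an answer span from a context passage
qa = pipeline("question-answering")

context = (
    "Artificial intelligence (AI) is intelligence demonstrated by machines, "
    "in contrast to the natural intelligence displayed by humans."
)

result = qa(question="What is artificial intelligence?", context=context)
print(f"Answer: {result['answer']} (score: {result['score']:.3f})")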

Output:
