Assignment 1

Uploaded by

Army
ASSIGNMENT-1, NOV-2024

Information Retrieval (CSE 4053)


Programme: B. Tech (CSE) Semester: 7th
Full Marks: 10 Date of Submission: 09-11-2024

Subject/Course Learning Outcome | *Taxonomy Level | Ques. Nos. | Marks
Outline the concepts and apply the basics of indexing and querying of an information retrieval system. | L2 | 1, 2 | 4
Understand the data corpus used in information retrieval systems. | L2 | 3, 4, 5 | 6
Illustrate various components and experiment with different compression techniques to compress the index of dictionary and its postings lists. | - | - | -
Apply retrieval models to construct information retrieval systems. | - | - | -
Understand the methods to enhance the retrieval system through the use of techniques like relevance feedback and query expansion. | - | - | -
Apply text clustering and classification techniques for information retrieval. | - | - | -
*Bloom’s taxonomy levels: Knowledge (L1), Comprehension (L2), Application (L3), Analysis (L4),
Evaluation (L5), Creation (L6).
Answer all questions. Each question carries equal marks.
• Assignment scores depend on neatness, clarity and date of submission.
• Write your answers with enough detail about your approach and the concepts used that the grader can follow them easily.
• You may use only those concepts that have been covered in the lectures to date.

1. Consider the following postings lists. [2 marks]

A → [4, 6, 10, 12, 14, 16, 18, 20, 22, 32, 47, 81, 120, 122, 157, 180]
B → [4, 18, 121, 180]
Work out how many comparisons are needed to intersect the two postings lists with skip pointers and without skip pointers.
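One way to check a hand count for question 1 is to instrument the merge itself. The sketch below is only an illustration under stated assumptions: skip pointers are placed every floor(sqrt(n)) postings (a common heuristic, not something the assignment specifies), each test of a[i] against b[j] counts as one comparison, and each look at a skip destination counts as one more. The function names are hypothetical, and a different counting convention will give different totals.

```python
import math


def intersect_plain(a, b):
    """Merge-style intersection; returns (common doc IDs, comparison count)."""
    i = j = comparisons = 0
    result = []
    while i < len(a) and j < len(b):
        comparisons += 1  # one three-way comparison of a[i] with b[j]
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result, comparisons


def _advance(postings, pos, step, target):
    """Move past postings[pos], taking skips while they do not overshoot target."""
    checks = 0
    moved = False
    # Assumption: skip pointers sit at positions 0, step, 2*step, ...
    # and exist only when the destination is inside the list.
    while pos % step == 0 and pos + step < len(postings):
        checks += 1  # one comparison against the skip destination
        if postings[pos + step] <= target:
            pos += step
            moved = True
        else:
            break
    if not moved:
        pos += 1  # no usable skip: step to the next posting
    return pos, checks


def intersect_with_skips(a, b):
    """Intersection with sqrt-spaced skip pointers; returns (result, comparisons)."""
    step_a = max(1, math.isqrt(len(a)))
    step_b = max(1, math.isqrt(len(b)))
    i = j = comparisons = 0
    result = []
    while i < len(a) and j < len(b):
        comparisons += 1
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i, extra = _advance(a, i, step_a, b[j])
            comparisons += extra
        else:
            j, extra = _advance(b, j, step_b, a[i])
            comparisons += extra
    return result, comparisons


A = [4, 6, 10, 12, 14, 16, 18, 20, 22, 32, 47, 81, 120, 122, 157, 180]
B = [4, 18, 121, 180]
print(intersect_plain(A, B))
print(intersect_with_skips(A, B))
```

Both functions return the same intersection; only the comparison totals differ, and those totals should be read together with the counting convention stated above.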
2. Construct the inverted index using the following documents. [2 marks]
Document 1: The quick brown fox jumped over the lazy dog.
Document 2: The lazy dog slept in the sun.
Document 3: The sun was bright.
Document 4: The bright sun was in the blue sky.
Document 5: The sky was blue.
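As a rough cross-check for question 2, the sketch below builds a term-to-document-ID index from the five documents. It assumes lowercasing and letter-only tokenization, and it keeps stop words such as "the"; these are illustrative choices rather than requirements of the assignment, and the function name is hypothetical.

```python
import re
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        # Assumption: lowercase the text and keep alphabetic tokens only.
        terms = set(re.findall(r"[a-z]+", docs[doc_id].lower()))
        for term in terms:
            index[term].append(doc_id)
    # Sort the dictionary alphabetically, as an index is usually presented.
    return dict(sorted(index.items()))


DOCS = {
    1: "The quick brown fox jumped over the lazy dog.",
    2: "The lazy dog slept in the sun.",
    3: "The sun was bright.",
    4: "The bright sun was in the blue sky.",
    5: "The sky was blue.",
}

for term, postings in build_inverted_index(DOCS).items():
    print(f"{term:8s} -> {postings}")
```

Because documents are visited in increasing ID order and each term is added at most once per document, every postings list comes out sorted without an extra sort step.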
3. For the document given below, stored in the data warehouse, compress the words by applying each of the following preprocessing techniques separately. [2 marks]
i. Normalization
ii. Normalization and stemming
iii. Stop words removal
Information retrieval is the activity of obtaining
information resources relevant to an information need from
a collection of information resources. Searches can be
based on full text or other content based indexing.
Automated information retrieval systems are used to reduce
what has been called "information overload". Many
universities and public libraries use IR systems to provide
access to books, journals and other documents. Web search
engines are the most visible IR applications.
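The three preprocessing steps in question 3 can be sketched on the last sentence of the passage. Everything here is an illustrative assumption: normalization is taken to mean lowercasing and dropping punctuation, the stop-word list is a small hypothetical one, and `toy_stem` is a crude suffix stripper standing in for a real stemmer such as Porter's, whose output will differ on many words.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {
    "a", "an", "and", "are", "be", "been", "by", "can", "from",
    "has", "is", "of", "on", "or", "other", "the", "to", "what",
}


def normalize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_stem(token):
    """Strip one common suffix; a toy stand-in, not the Porter algorithm."""
    for suffix in ("ing", "es", "s"):
        keeps_stem = len(token) - len(suffix) >= 3
        if token.endswith(suffix) and not token.endswith("ss") and keeps_stem:
            return token[: -len(suffix)]
    return token


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


SAMPLE = "Web search engines are the most visible IR applications."

tokens = normalize(SAMPLE)
print("i.   normalization:", tokens)
print("ii.  normalization + stemming:", [toy_stem(t) for t in tokens])
print("iii. stop words removed:", remove_stop_words(tokens))
```

Comparing the number of distinct terms before and after each step over the whole passage gives a simple measure of how much the vocabulary shrinks.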

4. State and discuss the algorithmic steps to compute the edit distance between two strings. Use the Levenshtein matrix to represent the distance score between "kitten" and "sitting". [2 marks]
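The standard dynamic-programming formulation of Levenshtein distance can be sketched as follows. It assumes unit cost for insertion, deletion and substitution; the function name is hypothetical. Cell d[i][j] holds the distance between the first i characters of the source and the first j characters of the target.

```python
def levenshtein_matrix(source, target):
    """Return the full DP matrix; the bottom-right cell is the edit distance."""
    rows, cols = len(source) + 1, len(target) + 1
    d = [[0] * cols for _ in range(rows)]
    # Base cases: turning a prefix into the empty string costs its length.
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
    return d


matrix = levenshtein_matrix("kitten", "sitting")
for row in matrix:
    print(" ".join(f"{value:2d}" for value in row))
print("edit distance:", matrix[-1][-1])
```

For "kitten" and "sitting" the bottom-right cell is 3, corresponding to two substitutions (k to s, e to i) and one insertion (g).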
5. Write the pseudocode for the Jaccard Coefficient and calculate the Jaccard Coefficient score between “carlo” and “carol” using bi-grams. [2 marks]
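A minimal sketch of the bi-gram Jaccard computation is given below. It assumes character bi-grams without boundary padding (no `$` markers at the word edges) and treats each word as a set of bi-grams; with padding, the score would differ. The function names are hypothetical, and returning 0.0 when both sets are empty is a convention chosen here.

```python
def bigrams(word):
    """Set of overlapping character bi-grams, with no boundary padding."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def jaccard(a, b):
    """Jaccard coefficient |X ∩ Y| / |X ∪ Y| over the two bi-gram sets."""
    x, y = bigrams(a), bigrams(b)
    union = x | y
    if not union:
        return 0.0  # convention chosen here for two empty bi-gram sets
    return len(x & y) / len(union)


print(sorted(bigrams("carlo")))
print(sorted(bigrams("carol")))
print(jaccard("carlo", "carol"))
```

Under these assumptions the two words share the bi-grams "ca" and "ar" out of six distinct bi-grams in total, so the score is 2/6, or about 0.33.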
