Reference Collection

Name : Sindhu D R
USN : 1AJ22CY048
• In information retrieval, a reference collection is
a set of documents that serves as a standard or
benchmark dataset for evaluating and
comparing the effectiveness of information
retrieval systems.
• Purpose: The primary goal of a reference
collection is to give researchers a fixed set of
documents, queries, and relevance judgments, so
that retrieval systems can be evaluated repeatably
and compared on equal terms.

• Types of Resources: This can include text
documents (e.g., articles, reports), media files
(e.g., images, videos), and raw data (e.g., datasets).
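As a rough illustration of how such a collection might be held in code, here is a minimal Python sketch. It assumes the usual IR setup in which a reference collection bundles documents with queries and relevance judgments. The class name, field names, and sample entries are all hypothetical and are not taken from any particular toolkit.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ReferenceCollection:
    """Hypothetical container for a small evaluation collection."""
    documents: Dict[str, str]                                   # doc id -> text
    queries: Dict[str, str] = field(default_factory=dict)       # query id -> query text
    judgments: Dict[str, Set[str]] = field(default_factory=dict)  # query id -> relevant doc ids

    def relevant_for(self, query_id: str) -> Set[str]:
        """Return the doc ids judged relevant for a query (empty set if unjudged)."""
        return self.judgments.get(query_id, set())


# Made-up sample data, for illustration only.
collection = ReferenceCollection(
    documents={"d1": "text of a report", "d2": "text of an article"},
    queries={"q1": "report"},
    judgments={"q1": {"d1"}},
)
print(collection.relevant_for("q1"))
```

Keeping the judgments separate from the documents means the same document set can be reused with new queries.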
TREC (Text REtrieval Conference)
• TREC, or the Text REtrieval Conference, is an
ongoing series of workshops and evaluations that
began in 1992, organized by the U.S. National
Institute of Standards and Technology (NIST).

• TREC’s primary goal is to advance research in
information retrieval (IR) by providing standardized
datasets and evaluation methods, allowing
researchers to benchmark and compare the
effectiveness of various retrieval algorithms and
systems.
Evaluation Methods

• Precision and Recall: These basic metrics
measure retrieval quality. Precision is the
fraction of retrieved documents that are relevant;
recall is the fraction of relevant documents that
are retrieved.
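The two definitions can also be written as formulas. Here R stands for the set of relevant documents and A for the set of retrieved documents; both symbols are notation chosen for this sketch, not taken from the slides.

```latex
\[
\text{Precision} = \frac{|R \cap A|}{|A|},
\qquad
\text{Recall} = \frac{|R \cap A|}{|R|}
\]
```

When only the top k results are scored, A is the set of the first k documents in the ranked list.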
Example:
Collection: A small set of 5 documents:

• Doc 1: "Artificial intelligence is revolutionizing technology."
• Doc 2: "The health benefits of meditation are numerous."
• Doc 3: "AI applications in healthcare are growing rapidly."
• Doc 4: "Sports events attract millions of viewers every year."
• Doc 5: "Recent advancements in AI have improved diagnostics."
• Query: "AI in healthcare"

• Relevant Documents:
Doc 3: "AI applications in healthcare are growing rapidly."
Doc 5: "Recent advancements in AI have improved diagnostics."

• System Response: The retrieval system returns the following
ranked list: Doc 5, Doc 1, Doc 3, Doc 2, Doc 4

• Evaluation:
Precision at rank 3 (top 3 documents):
Relevant documents in top 3 = 2 (Doc 5 and Doc 3)
Total documents in top 3 = 3
Precision = 2/3 ≈ 0.67 (or 67%)

Recall at rank 3:
Total relevant documents = 2 (Doc 3 and Doc 5)
Relevant documents retrieved in top 3 = 2 (Doc 3 and Doc 5)
Recall = 2/2 = 1.0 (or 100%)
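The calculation above can be reproduced with a few lines of Python. This is a minimal sketch: the function names `precision_at_k` and `recall_at_k` are illustrative choices, and precision is divided by the cutoff k, which assumes the system returned at least k documents.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in ranking[:k] if doc in relevant)
    return hits / k


def recall_at_k(ranking, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        raise ValueError("need at least one relevant document")
    hits = sum(1 for doc in ranking[:k] if doc in relevant)
    return hits / len(relevant)


# Ranked list and relevance judgments from the example above.
ranking = ["Doc5", "Doc1", "Doc3", "Doc2", "Doc4"]
relevant = {"Doc3", "Doc5"}

print(f"Precision at 3 = {precision_at_k(ranking, relevant, 3):.2f}")  # 0.67
print(f"Recall at 3    = {recall_at_k(ranking, relevant, 3):.2f}")     # 1.00
```

Changing the cutoff shows the trade-off between the two metrics: at rank 1 precision is 1.0 but recall is only 0.5, because Doc 3 has not yet been retrieved.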
Impact on Information Retrieval Research
• Standardization: Providing standardized datasets
and evaluation methodologies that allow for
consistent comparisons across different systems.

• Community Building: Fostering collaboration and
communication among researchers, leading to
advancements in IR techniques and technologies.

• Benchmarking: Establishing benchmarks that help
researchers understand the strengths and
weaknesses of their systems in various retrieval
Thank You
