1. Explain User Search Techniques
♻️ 3. Relevance Feedback
Enhances search by using user feedback.
Types:
o Explicit: User marks documents as relevant/non-relevant.
o Implicit: System assumes feedback based on user interaction (e.g., clicks).
System modifies the original query by:
o Increasing weights of terms in relevant docs.
o Decreasing weights of terms in non-relevant docs.
Common method: Rocchio’s algorithm for query refinement.
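The query modification described above can be sketched in a few lines. This is a minimal illustration of the Rocchio update, assuming queries and documents are represented as term-to-weight dictionaries; the function name `rocchio` and the default values of `alpha`, `beta`, and `gamma` are illustrative assumptions, not values given in these notes.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a refined query as a dict of term -> weight.

    query, and each item of relevant / non_relevant, is a dict mapping
    a term to its weight. alpha, beta, gamma are assumed tuning values.
    """
    terms = set(query)
    for doc in relevant + non_relevant:
        terms.update(doc)

    refined = {}
    for term in terms:
        # Average weight of the term in relevant and non-relevant documents.
        pos = sum(d.get(term, 0.0) for d in relevant) / len(relevant) if relevant else 0.0
        neg = sum(d.get(term, 0.0) for d in non_relevant) / len(non_relevant) if non_relevant else 0.0
        weight = alpha * query.get(term, 0.0) + beta * pos - gamma * neg
        if weight > 0:  # terms pushed to zero or below are dropped here
            refined[term] = weight
    return refined


query = {"data": 1.0, "security": 1.0}
relevant = [{"data": 1.0, "encryption": 2.0}]
non_relevant = [{"data": 1.0, "mining": 3.0}]
refined = rocchio(query, relevant, non_relevant)
```

In this example "encryption" enters the refined query because it appears in a relevant document, while "mining" is dropped because it appears only in a non-relevant one.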
🎨 7. Information Visualization
Supports search by displaying data in graphical or interactive forms.
Aims to help users understand complex information quickly.
Based on cognitive psychology and visual perception principles.
Tools include:
o Graphs, charts, network diagrams, maps, and timelines.
Useful in exploring large search results or patterns within data.
Term clustering is a technique used in Information Retrieval to group similar terms based on
their co-occurrence in documents. It helps in expanding user queries with related terms,
improving search effectiveness.
✅ Purpose
🔍 Working Principle
Terms that appear frequently together in the same documents are considered to be
about the same concept.
A similarity measure (e.g., cosine similarity) is computed between term vectors
(frequency of terms in documents).
A matrix is created where each cell indicates similarity between two terms.
A threshold is applied: if the similarity score exceeds the threshold, the terms are
grouped.
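The steps above can be sketched as follows. This is a minimal illustration, assuming each term is represented by a vector of its frequencies across the same ordered set of documents. The sample vectors, the threshold of 0.8, and the choice to merge any terms linked by an above-threshold similarity (single-link grouping) are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_terms(term_vectors, threshold):
    """Group terms whose pairwise similarity exceeds the threshold."""
    terms = sorted(term_vectors)
    # Term-term similarity matrix, stored as a dict keyed by term pairs.
    sim = {(a, b): cosine(term_vectors[a], term_vectors[b])
           for a in terms for b in terms}

    clusters, seen = [], set()
    for term in terms:
        if term in seen:
            continue
        seen.add(term)
        group, stack = [], [term]
        while stack:
            current = stack.pop()
            group.append(current)
            for other in terms:
                if other not in seen and sim[(current, other)] > threshold:
                    seen.add(other)
                    stack.append(other)
        clusters.append(sorted(group))
    return clusters


# Each vector holds the term's frequency in documents D1..D4 (made-up data).
term_vectors = {
    "encryption": [2, 1, 0, 0],
    "cipher": [1, 1, 0, 0],
    "firewall": [0, 0, 3, 1],
    "network": [0, 0, 1, 1],
}
clusters = cluster_terms(term_vectors, threshold=0.8)
# clusters == [["cipher", "encryption"], ["firewall", "network"]]
```

A query containing "encryption" could then be expanded with "cipher", since both terms fall in the same cluster.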
📌 Clustering Techniques
📈 Applications
✅ Introduction
⚙️ Working Principle
KMP preprocesses the pattern to build a Longest Prefix Suffix (LPS) array.
It avoids redundant comparisons by reusing what is already known about the characters matched so far, so the text pointer never moves backward.
Time Complexity:
o Preprocessing (LPS array) – O(m)
o Search – O(n)
o Overall – O(n + m), where n is the text length and m is the pattern length.
1. Preprocess Pattern:
o Create the LPS array that stores the length of the longest prefix that is also a
suffix.
2. Search Phase:
o Scan the text using the pattern.
o Use the LPS array to skip unnecessary comparisons when a mismatch occurs.
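The two phases above can be sketched as follows. This is a minimal illustration rather than a reference implementation; the function names `build_lps` and `kmp_search` are chosen for the example.

```python
def build_lps(pattern):
    """lps[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    lps = [0] * len(pattern)
    length = 0  # length of the current matched prefix
    i = 1
    while i < len(pattern):
        if pattern[i] == pattern[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length:
            # Fall back to the next shorter candidate prefix.
            length = lps[length - 1]
        else:
            lps[i] = 0
            i += 1
    return lps


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    lps = build_lps(pattern)
    matches = []
    j = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        # On a mismatch, shift the pattern using the LPS array;
        # the text index i never moves backward.
        while j and ch != pattern[j]:
            j = lps[j - 1]
        if ch == pattern[j]:
            j += 1
        if j == len(pattern):
            matches.append(i - j + 1)
            j = lps[j - 1]
    return matches


print(build_lps("abab"))  # [0, 0, 1, 2]
print(kmp_search("big data and data science are emerging fields", "data"))  # [4, 13]
```

Because the loop visits each text character once and the fallback steps are bounded by the number of earlier matches, the search stays linear in the text length.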
🧠 Use in Information Retrieval Systems
Exact keyword search: KMP can locate exact phrases in large documents.
Document scanning: Fast scanning of large corpora for query patterns.
Efficient indexing: Helps in pattern-based document indexing.
Text processing tools: Integrated into search engines and text editors.
🧠 Advantages
🔴 Limitations
📚 Example
Pattern: "data"
Text: "big data and data science are emerging fields"
KMP locates both occurrences of "data" in a single left-to-right pass over the text, without re-comparing characters that have already been matched.
✅ Goals in IR Systems
A user searches for "Data Security." The system displays a tree map with clusters like
"Encryption," "Access Control," and "Firewalls." Clicking a cluster shows documents and
terms ranked by relevance.
🧠 Benefit to Users
🔹 Indexing
Indexing is the process of organizing data or documents so that relevant information can be
retrieved efficiently.
An index is a searchable data structure that maps terms (keywords) to the documents in which
they appear. It improves the speed and accuracy of information retrieval.
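A term-to-document mapping of this kind is commonly built as an inverted index. The sketch below is a minimal illustration; the sample documents, the lowercase alphanumeric tokenization, and the helper names `build_index` and `lookup` are assumptions made for the example.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


def lookup(index, term):
    """Return the sorted IDs of documents containing the term."""
    return sorted(index.get(term.lower(), set()))


docs = {
    1: "Data security and encryption",
    2: "Big data analytics",
    3: "Network firewalls",
}
index = build_index(docs)
print(lookup(index, "data"))       # [1, 2]
print(lookup(index, "firewalls"))  # [3]
```

A lookup touches only the entry for the query term, so the documents themselves do not need to be rescanned at search time.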
🔹 Automatic Indexing
Automatic Indexing is the computerized process of analyzing documents and extracting key
terms or features to build an index without human intervention.
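One simple way to extract key terms automatically is to count word frequencies after discarding very common words. The sketch below is only an illustration under that assumption; the stop-word list, the `top_k` parameter, and the function name `extract_index_terms` are hypothetical and not taken from these notes.

```python
import re
from collections import Counter

# A tiny, assumed stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "are"}


def extract_index_terms(text, top_k=3):
    """Return the top_k most frequent non-stop-word terms with their counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_k)


text = "Encryption protects data. Encryption of data is the basis of data security."
print(extract_index_terms(text, top_k=2))  # [('data', 3), ('encryption', 2)]
```

The selected terms can then be entered into the index for the document without any manual tagging.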
✅ Advantages
✅ Key Difference
🧠 Example
Text search refers to the process of finding specific words, patterns, or phrases within a
collection of text or documents. It is a core function in Information Retrieval Systems,
enabling users to locate relevant information by matching a query with stored data.
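In its simplest form, matching a query against stored data is a case-insensitive phrase match over each document. The sketch below assumes a small in-memory collection; the sample documents and the function name `search_collection` are illustrative.

```python
def search_collection(docs, query):
    """Return the IDs of documents whose text contains the query phrase."""
    needle = query.lower()
    return [doc_id for doc_id, text in docs.items() if needle in text.lower()]


docs = {
    "d1": "Big data and data science are emerging fields",
    "d2": "Network security basics",
    "d3": "Data security in cloud systems",
}
print(search_collection(docs, "data"))           # ['d1', 'd3']
print(search_collection(docs, "data security"))  # ['d3']
```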
⚙️ How It Works
3. Boolean Search
4. Proximity Search
5. Fuzzy Search
✅ Conclusion
Text search enables users to efficiently retrieve information. It can be implemented through
different strategies, ranging from simple string matching to advanced pattern recognition
using either software or hardware.