UNIT 3 Notes
Objectives:
Types:
o Lossless Compression:
o Lossy Compression:
Applications:
Modeling:
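o Example: In English text, letters like ‘e’ and ‘t’ appear far more often than letters like ‘q’ and ‘z’, so a model can assign them higher probabilities (and the coder shorter codes).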
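To make the modeling idea concrete, here is a minimal sketch that estimates letter probabilities from a sample string and computes the ideal code length (-log2 p) for each. The function names `symbol_model` and `ideal_code_length` and the sample sentence are illustrative assumptions, not part of the original notes.

```python
import math
from collections import Counter

def symbol_model(text):
    """Estimate letter probabilities from observed counts (hypothetical helper)."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return {sym: n / total for sym, n in counts.items()}

def ideal_code_length(p):
    """Information content in bits: frequent symbols need fewer bits."""
    return -math.log2(p)

# Illustrative sample only; a real model would be trained on a large corpus.
sample = "the tree meets the street"
model = symbol_model(sample)

for sym in sorted(model, key=model.get, reverse=True):
    print(sym, round(model[sym], 3), round(ideal_code_length(model[sym]), 2))
```

In this sample 'e' and 't' come out on top, so they receive the shortest ideal code lengths, which is exactly what the coding stage exploits.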
Coding:
Steps:
3. Huffman Coding
Working:
Advantages:
Disadvantages:
Applications:
o Text compression
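As a rough illustration of how Huffman coding works, the sketch below builds a code table by repeatedly merging the two lowest-weight nodes, then encodes and decodes a string. It is a minimal version under simplifying assumptions: the names `huffman_codes`, `encode`, and `decode` are hypothetical, the code table is kept in memory rather than stored with the output, and bits are represented as a Python string for readability.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a dict mapping each symbol to its Huffman bit string."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prefixes their codes with 0 and 1.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(text, codes):
    return "".join(codes[ch] for ch in text)

def decode(bits, codes):
    # Prefix-free codes can be decoded greedily, one bit at a time.
    inverse = {c: s for s, c in codes.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return "".join(out)

message = "abracadabra"
codes = huffman_codes(message)
bits = encode(message, codes)
print(codes, len(bits), decode(bits, codes) == message)
```

The most frequent symbol ('a' here) ends up with the shortest code, and because no code is a prefix of another, decoding needs no separators.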
Working:
Advantages:
Disadvantages:
o Computationally intensive.
Applications:
o JPEG 2000
Definition: Text compression methods that assign codes to individual symbols rather
than blocks of text.
Key Methods:
o Huffman Coding
o Arithmetic Coding
Importance:
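Concept: Store gaps (the differences between consecutive docIDs in a sorted postings list) instead of raw IDs. Gaps are usually much smaller than the IDs themselves, so they can be encoded in fewer bits.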
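A small sketch of gap encoding, assuming the postings list is sorted in strictly increasing order and that the first gap is measured from 0. The helper names `to_gaps` and `from_gaps` and the sample docIDs are illustrative.

```python
from itertools import accumulate

def to_gaps(doc_ids):
    """Convert a sorted docID list into gaps (first gap is relative to 0)."""
    prev = 0
    gaps = []
    for doc_id in doc_ids:
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps

def from_gaps(gaps):
    """Recover the original docIDs with a running sum over the gaps."""
    return list(accumulate(gaps))

postings = [3, 7, 8, 15, 16, 40]   # made-up docIDs for illustration
gaps = to_gaps(postings)
print(gaps)              # [3, 4, 1, 7, 1, 24]
print(from_gaps(gaps))   # [3, 7, 8, 15, 16, 40]
```

The gap list on its own saves nothing; the saving comes when the small gap values are then written with a variable-length integer code.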
Techniques:
Techniques:
Advantages:
Compression Effectiveness
Document Reordering
Techniques:
Benefits:
Definition: Index structures that can be efficiently updated with additions, deletions,
or modifications.
Techniques:
Advantages:
o Reduced downtime.
Advantages:
Disadvantages:
o Harder to update incrementally.
Advantages:
Disadvantages:
Advantages:
Garbage Collection
Advantages:
o Frees space.
Document Modifications
Concept: Modify a document by deleting the old version and inserting the new one.
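The delete-then-insert idea can be sketched with a toy in-memory inverted index. This is an illustrative assumption rather than a production design: the class `TinyIndex` and its methods are hypothetical, tokenization is a plain whitespace split, and deletion removes postings immediately instead of marking them for later garbage collection.

```python
from collections import defaultdict

class TinyIndex:
    """Toy inverted index: term -> set of docIDs (illustration only)."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.docs = {}  # docID -> list of terms, needed to undo an insert

    def insert(self, doc_id, text):
        terms = text.lower().split()
        self.docs[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def delete(self, doc_id):
        for term in self.docs.pop(doc_id, []):
            ids = self.postings.get(term)
            if ids is not None:
                ids.discard(doc_id)
                if not ids:
                    del self.postings[term]

    def modify(self, doc_id, new_text):
        # A modification is a deletion of the old version plus an insertion.
        self.delete(doc_id)
        self.insert(doc_id, new_text)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), ()))

idx = TinyIndex()
idx.insert(1, "fast index")
idx.insert(2, "fast search")
idx.modify(1, "dynamic index")
print(idx.search("fast"), idx.search("dynamic"), idx.search("index"))
```

Note that the whole document is reprocessed even when only one word changes, which is the main cost of this approach.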
Advantages:
Disadvantages: