
CS 1501

Compression
The Compression "Problem"

Given a representation of some information, encode that information using fewer bits than the original representation

2
Why compress?

● Can get more use out of a disk of a given size
● Can get more use out of memory
○ E.g., free up memory by compressing inactive sections
■ Faster than paging
■ Built into OS X Mavericks and later
● Can reduce the amount of data transmitted
○ Faster file transfers
○ Cut power usage on mobile devices

3
Approaches to compression

● Can be grouped into two broad categories…

4
Lossy Compression

D → Compress → C → Expand → D′ (D′ ≠ D)

● Information is permanently lost in the compression process


● Examples:
○ MP3, H264, JPEG
● With audio/video files this typically isn’t a huge problem as
human users might not be able to perceive the difference

5
Lossy examples

● MP3
○ “Cuts out” portions of audio that are considered beyond what
most people are capable of hearing
● JPEG

[Figure: example image, 40K original vs. 28K after JPEG compression]

6
Lossless Compression

D → Compress → C → Expand → D

● Input can be recovered from compressed data exactly

● Examples:

○ zip files, FLAC

7
The Lossless Compression "Problem"

Given a representation of some information, encode the same information exactly using fewer bits than the original representation

8
Huffman Compression

● Works on arbitrary bit strings, but pretty easily explained using characters
● Consider the ASCII character set
○ Essentially fixed-length blocks of codes
■ In general, to fit R potential characters in a block, you need lg R bits of storage per block
● Consequently, n-bit storage blocks can represent 2^n characters
■ Each 8-bit code block represents one of 256 possible characters in extended ASCII
■ Easy to encode/decode

9
Considerations for compressing ASCII

● What if we used variable length codewords instead of the constant 8? Could we store the same info in less space?
○ Different characters are represented using codes of different bit lengths
○ If all characters in the alphabet have the same usage frequency, we can't beat block storage
■ On a character by character basis…
○ What about different usage frequencies between characters?
■ In English, R, S, T, L, N, E are used much more than Q or X

10
Variable length encoding

● Decoding was easy for block codes
○ Grab the next 8 bits in the bitstring
○ How can we decode a bitstring that is made up of variable length code words?
○ BAD example of variable length encoding:

1     A
00    T
01    K
001   U
100   R
101   C
10101 N

○ E.g., the bitstring 100 could decode as R or as A T ("1" then "00"), so decoding is ambiguous

11
Variable length encoding for lossless compression

● Codes must be prefix free

○ No code can be a prefix of any other in the scheme

○ Using this, we can achieve compression by:

■ Using fewer bits to represent more common characters

■ Using longer codes to represent less common characters

12
How can we create these prefix-free codes?

Huffman encoding!

13
Generating Huffman codes

● Assume we have K characters that are used in the file to be compressed, and each has a weight (its frequency of use)
● Create a forest, F, of K single-node trees, one for each character, with the single node storing that char's weight
● While more than one tree remains in F:
○ Select T1, T2 ∈ F that have the smallest weights in F
○ Create a new tree node N whose weight is the sum of T1 and T2's weights
○ Remove T1 and T2 from F
○ Add T1 and T2 as children (subtrees) of N
○ Add the new tree rooted by N to F
● Build a tree for "ABRACADABRA!"
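A minimal Java sketch of this loop, using java.util.PriorityQueue as the forest F (the class and field names here are illustrative, not the textbook's Huffman.java):

import java.util.PriorityQueue;

class Node implements Comparable<Node> {
    final char ch;            // meaningful only for leaf nodes
    final int weight;         // frequency of use
    final Node left, right;
    Node(char ch, int weight, Node left, Node right) {
        this.ch = ch; this.weight = weight; this.left = left; this.right = right;
    }
    boolean isLeaf() { return left == null && right == null; }
    public int compareTo(Node that) { return this.weight - that.weight; }
}

class HuffmanTrieBuilder {
    // freq[c] holds the number of times character c appears in the input
    static Node buildTrie(int[] freq) {
        PriorityQueue<Node> forest = new PriorityQueue<>();
        for (char c = 0; c < freq.length; c++)
            if (freq[c] > 0) forest.add(new Node(c, freq[c], null, null));
        while (forest.size() > 1) {                 // merge the two lightest trees
            Node t1 = forest.poll();
            Node t2 = forest.poll();
            forest.add(new Node('\0', t1.weight + t2.weight, t1, t2));
        }
        return forest.poll();                       // root of the Huffman trie
    }
}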

14
ABRACADABRA!

[Figure: Huffman trie for "ABRACADABRA!" — leaf weights A:5, B:2, R:2, C:1, D:1, !:1; root weight 12; resulting codes A=0, B=100, R=101, C=110, D=1110, !=1111]

Compressed bitstring: 010010101100 111001001010 1111 (28 bits, vs. 96 bits in 8-bit ASCII)
15
Implementation concerns

● Need to efficiently be able to select lowest weight trees to merge when constructing the trie
○ Can accomplish this using a priority queue
● Need to be able to read/write bitstrings!
○ Unless we pick multiples of 8 bits for our codewords, we will
need to read/write fractions of bytes for our codewords
■ We’re not actually going to do I/O on fraction of bytes
■ We’ll maintain a buffer of bytes and perform bit
processing on this buffer
■ See BinaryStdIn.java and BinaryStdOut.java

16
Binary I/O

private static void writeBit(boolean bit) {
    // add bit to buffer
    buffer <<= 1;
    if (bit) buffer |= 1;
    // if buffer is full (8 bits), write out as a single byte
    N++;
    if (N == 8) clearBuffer();
}

writeBit(true);     buffer: ???????1   N: 1
writeBit(false);    buffer: ??????10   N: 2
writeBit(true);     buffer: ?????101   N: 3
writeBit(false);    buffer: ????1010   N: 4
writeBit(false);    buffer: ???10100   N: 5
writeBit(false);    buffer: ??101000   N: 6
writeBit(false);    buffer: ?1010000   N: 7
writeBit(true);     buffer: 10100001   N: 8 → clearBuffer() writes the byte and resets N to 0
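Reading goes the same way in reverse: keep a byte-sized buffer and hand out one bit at a time, refilling when it empties. A rough sketch of the idea only; the actual BinaryStdIn.java differs in details such as EOF handling:

private static int buffer;    // up to 8 bits of input held in memory
private static int n;         // number of unread bits remaining in buffer

private static boolean readBit() {
    if (n == 0) fillBuffer();               // read the next byte from the stream
    n--;
    return ((buffer >> n) & 1) == 1;        // hand out the most significant unread bit
}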
17
Representing tries as bitstrings

18
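The usual approach (and the one taken in the textbook's Huffman code) is a preorder traversal of the trie: write a 1 bit followed by the 8-bit character for each leaf, and a 0 bit for each internal node. A sketch, reusing the Node class from the earlier sketch and the algs4 bit I/O classes:

// Preorder serialization: 1 + 8-bit char for a leaf, 0 for an internal node
static void writeTrie(Node x) {
    if (x.isLeaf()) {
        BinaryStdOut.write(true);
        BinaryStdOut.write(x.ch, 8);
        return;
    }
    BinaryStdOut.write(false);
    writeTrie(x.left);
    writeTrie(x.right);
}

// Expansion reads the same bits back: a 1 bit means "a leaf's character follows"
static Node readTrie() {
    if (BinaryStdIn.readBoolean())
        return new Node(BinaryStdIn.readChar(8), 0, null, null);
    return new Node('\0', 0, readTrie(), readTrie());
}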
Huffman pseudocode

● Encoding approach:
○ Read input
○ Compute frequencies
○ Build trie/codeword table
○ Write out trie as a bitstring to compressed file
○ Write out character count of input
○ Use table to write out the codeword for each input character
● Decoding approach:
○ Read trie
○ Read character count
○ Use trie to decode bitstring of compressed file
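The "build codeword table" step above is a traversal of the trie that records the path of 0s (left) and 1s (right) down to each leaf; a small sketch, with codewords kept as Strings for readability:

// st[c] ends up holding the bitstring (as text) that encodes character c
static void buildCode(String[] st, Node x, String path) {
    if (x.isLeaf()) {
        st[x.ch] = path;
        return;
    }
    buildCode(st, x.left,  path + '0');   // left edge contributes a 0
    buildCode(st, x.right, path + '1');   // right edge contributes a 1
}

// usage: String[] st = new String[256]; buildCode(st, root, "");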
19
Further implementation concerns

● To encode/decode, we'll need to read in characters and output codes / read in codes and output characters

○ …

○ Sounds like we'll need a symbol table!

■ What implementation would be best?

● Same for encoding and decoding?

○ Note that this means we need access to the trie to expand a

compressed file!

20
How do we determine character frequencies?

● Option 1: Preprocess the file to be compressed


○ Upside: Ensure that Huffman’s algorithm will produce the
best output for the given file
○ Downsides:
■ Requires two passes over the input, one to analyze
frequencies/build the trie/build the code lookup table, and
another to compress the file
■ Trie must be stored with the compressed file, reducing the
quality of the compression
● This especially hurts small files
● Generally, large files are more amenable to Huffman
compression
○ Just because a file is large, however, does not mean that
it will compress well!

21
How do we determine character frequencies?

● Option 2: Use a static trie


○ Analyze multiple sample files, build a single tree that will be
used for all compressions/expansions
○ Saves on trie storage overhead…
○ But in general not a very good approach
■ Different character frequency characteristics of different
files means that a code set/trie that works well for one file
could work very poorly for another
● Could even cause an increase in file size after
“compression”!

22
How do we determine character frequencies?

● Option 3: Adaptive Huffman coding


○ Single pass over the data to construct the codes and compress
a file with no background knowledge of the source
distribution
○ Not going to really focus on adaptive Huffman in the class, just
pointing out that it exists...

23
Ok, so how good is Huffman compression

● ASCII requires 8m bits to store m characters


● For a file containing c different characters
○ Given Huffman codes {h_0, h_1, h_2, …, h_{c−1}}
○ And frequencies {f_0, f_1, f_2, …, f_{c−1}}
○ Total bits = Σ (i = 0 to c−1) len(h_i) · f_i
● Total storage depends on the differences in frequencies
○ The bigger the differences, the better the potential for
compression
● Huffman is optimal for character-by-character prefix-free
encodings
○ Proof in Propositions T and U of Section 5.5 of the text
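Worked example, using the ABRACADABRA! trie from slide 15 (code lengths 1, 3, 3, 3, 4, 4 for A, B, R, C, D, !):

Σ len(h_i) · f_i = 5·1 + 2·3 + 2·3 + 1·3 + 1·4 + 1·4 = 28 bits, versus 12 × 8 = 96 bits in uncompressed ASCII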

24
That seems like a bit of a caveat...

● Where does Huffman fall short?


○ What about repeated patterns of multiple characters?
■ Consider a file containing:
● 1000 A’s
● 1000 B’s
● …
● 1000 of every ASCII character
■ Will this compress at all with Huffman encoding?
● Nope!
■ But it seems like it should be compressible...

25
Run length encoding

● Could represent the previously mentioned string as:


○ 1000A1000B1000C, etc.
■ Assuming we use 10 bits to represent the number of
repeats, and 8 bits to represent the character…
● 256 runs × (10 + 8) bits = 4608 bits needed to store the run length encoded file
● vs. 256 × 1000 × 8 = 2,048,000 bits for the input file
● Huge savings!

● Note that this incredible compression performance is based on a very specific scenario…
○ Run length encoding is not generally effective for most files, as
they often lack long runs of repeated characters
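A rough sketch of the scheme described above, with 10-bit run lengths (so runs are capped at 1023 and longer runs would be split); rleCompress is a hypothetical name, and the algs4 BinaryStdOut from the Huffman slides is assumed for bit output:

// Emit (count, character) pairs: 10 bits for the run length, 8 bits for the character
static void rleCompress(String input) {
    int i = 0;
    while (i < input.length()) {
        char c = input.charAt(i);
        int run = 0;
        while (i < input.length() && input.charAt(i) == c && run < 1023) {
            run++;                         // extend the current run (max 2^10 - 1)
            i++;
        }
        BinaryStdOut.write(run, 10);       // 10-bit repeat count
        BinaryStdOut.write(c, 8);          // 8-bit character
    }
    BinaryStdOut.close();
}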
26
What else can we do to compress files?

27
Patterns are compressible, need a general approach

● Huffman used variable-length codewords to represent fixed-length portions of the input…
○ Let's try another approach that uses fixed-length codewords to represent variable-length portions of the input
● Idea: the more characters we can represent in a single codeword, the better the compression
○ Consider "the": 24 bits in ASCII
○ Representing "the" with a single 12 bit codeword cuts the used
space in half
■ Similarly, representing longer strings with a 12 bit
codeword would mean even better savings!

28
How do we know that “the” will be in our file?

● Need to avoid the same problems as the use of a static trie for Huffman encoding…
● So use an adaptive algorithm and build up our patterns and codewords as we go through the file

29
LZW compression

● Initialize codebook to all single characters


○ e.g., character maps to its ASCII value
● While !EOF:
○ Match longest prefix in codebook
○ Output codeword
○ Take this longest prefix, add the next character in the file, and
add the result to the dictionary with a new codeword
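A compact sketch of this loop, using a HashMap keyed by Strings for readability (the textbook's LZW.java instead uses a TST for the longest-prefix match) and 12-bit codewords written with the algs4 BinaryStdOut:

import java.util.HashMap;
import java.util.Map;

class LZWCompressSketch {
    static final int R = 256;            // single-character codewords 0..255
    static final int W = 12;             // codeword width in bits
    static final int L = 1 << W;         // 4096 codewords total

    static void compress(String input) {
        Map<String, Integer> codebook = new HashMap<>();
        for (int c = 0; c < R; c++)
            codebook.put("" + (char) c, c);          // initialize to single characters
        int next = R + 1;                            // codeword R is reserved for EOF

        int i = 0;
        while (i < input.length()) {
            // match the longest prefix of the remaining input that is in the codebook
            int j = i + 1;
            while (j <= input.length() && codebook.containsKey(input.substring(i, j)))
                j++;
            String prefix = input.substring(i, j - 1);
            BinaryStdOut.write(codebook.get(prefix), W);      // output its codeword
            // add prefix + next character under a fresh codeword
            if (j <= input.length() && next < L)
                codebook.put(input.substring(i, j), next++);
            i += prefix.length();
        }
        BinaryStdOut.write(R, W);                    // EOF codeword
        BinaryStdOut.close();
    }
}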

30
LZW compression example

● Compress, using 12 bit codewords:


○ TOBEORNOTTOBEORTOBEORNOT

Cur   Output   Add
T     84       TO:257
O     79       OB:258
B     66       BE:259
E     69       EO:260
O     79       OR:261
R     82       RN:262
N     78       NO:263
O     79       OT:264
T     84       TT:265
TO    257      TOB:266
BE    259      BEO:267
OR    261      ORT:268
TOB   266      TOBE:269
EO    260      EOR:270
RN    262      RNO:271
OT    264      --
      256      --
31
LZW expansion

● Initialize codebook to all single characters


○ e.g., ASCII value maps to its character
● While !EOF:
○ Read next codeword from file
○ Lookup corresponding pattern in the codebook
○ Output that pattern
○ Add the previous pattern + the first character of the current
pattern to the codebook

Note: this means no codebook addition after the first pattern output!
32
LZW expansion example

Cur   Output   Add
84    T        --
79    O        257:TO
66    B        258:OB
69    E        259:BE
79    O        260:EO
82    R        261:OR
78    N        262:RN
79    O        263:NO
84    T        264:OT
257   TO       265:TT
259   BE       266:TOB
261   OR       267:BEO
266   TOB      268:ORT
260   EO       269:TOBE
262   RN       270:EOR
264   OT       271:RNO
256   --

33
How does this work out?

● Both compression and expansion construct the same codebook!

○ Compression stores character string → codeword

○ Expansion stores codeword → character string

○ They contain the same pairs in the same order

■ Hence, the codebook doesn’t need to be stored with the

compressed file, saving space

34
Just one tiny little issue to sort out...

● Expansion's codebook will always be a step "behind" compression's when processing the same pattern
○ If, during compression, the (pattern, codeword) that was just
added to the dictionary is immediately used in the next step,
the decompression algorithm will not yet know the codeword.
○ This can be easily detected and dealt with, however

35
LZW corner case example

● Compress, using 12 bit codewords: AAAAAA

Cur Output Add


A 65 AA:257
AA 257 AAA:258
AAA 258 --

● Expansion:

Cur Output Add


65 A --
257 AA 257:AA
258 AAA 258:AAA
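Putting the expansion pseudocode and this corner case together, a sketch (12-bit codewords, the algs4 BinaryStdIn/BinaryStdOut assumed, and codeword 256 treated as EOF as in the compression example):

class LZWExpandSketch {
    static final int R = 256, W = 12, L = 1 << W;

    static void expand() {
        String[] codebook = new String[L];
        for (int c = 0; c < R; c++)
            codebook[c] = "" + (char) c;          // single-character codewords
        int next = R + 1;                         // next codeword to be defined

        int codeword = BinaryStdIn.readInt(W);
        if (codeword == R) return;                // empty input
        String previous = codebook[codeword];
        BinaryStdOut.write(previous);             // no codebook addition after the first output

        while (true) {
            codeword = BinaryStdIn.readInt(W);
            if (codeword == R) break;             // EOF codeword
            String current;
            if (codeword < next) current = codebook[codeword];
            else current = previous + previous.charAt(0);   // corner case: codeword not defined yet
            BinaryStdOut.write(current);
            if (next < L)
                codebook[next++] = previous + current.charAt(0);
            previous = current;
        }
        BinaryStdOut.close();
    }
}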
36
LZW implementation concerns: codebook

● How to represent/store during:


○ Compression
○ Expansion
● Considerations:
○ What operations are needed?
○ How many of these operations are going to be performed?
● Discuss

37
Further implementation issues: codeword size

● How long should codewords be?


○ Use fewer bits:
■ Gives better compression earlier on
■ But, leaves fewer codewords available, which will hamper
compression later on
○ Use more bits:
■ Delays actual compression until longer patterns are found
due to large codeword size
■ More codewords available means that greater
compression gains can be made later on in the process

38
Variable width codewords

● This sounds eerily like variable length codewords…


○ Exactly what we set out to avoid!
● Here, we’re talking about a different technique
● Example:
○ Start out using 9 bit codewords
○ When codeword 512 is inserted into the codebook, switch to
outputting/grabbing 10 bit codewords
○ When codeword 1024 is inserted into the codebook, switch to
outputting/grabbing 11 bit codewords…
○ Etc.
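In code this is just a check made every time a codeword is added to the codebook; a tiny illustrative snippet (variable names hypothetical, and compressor and expander must apply the same rule so they stay in sync):

int width = 9;                     // start with 9-bit codewords
int nextCode = 257;                // next codeword to assign

// ... every time an entry is added to the codebook:
nextCode++;
if (nextCode == (1 << width) && width < 16)
    width++;                       // e.g., codeword 512 reached -> switch to 10-bit codewords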

39
Even further implementation issues: codebook size

● What happens when we run out of codewords?


○ Only 2^n possible codewords for n-bit codes
○ Even using variable width codewords, they can’t grow
arbitrarily large…
● Two primary options:
○ Stop adding new codewords, use the codebook as it stands
■ Maintains long already established patterns
■ But if the file changes, it will not be compressed as
effectively
○ Throw out the codebook and start over from single characters
■ Allows new patterns to be compressed
■ Until new patterns are built up, though, compression will
be minimal

40
The showdown you’ve all been waiting for...

HUFFMAN vs LZW

● In general, LZW will give better compression


○ Also better for compressing archived directories of files
■ Why?
● Very long patterns can be built up, leading to better
compression
● Different files don’t “hurt” each other as they did in Huffman
○ Remember our thoughts on using static tries?

41
So lossless compression apps use LZW?

● Most dedicated compression applications use other algorithms:


○ DEFLATE (combination of LZ77 and Huffman)
■ Used by PKZIP and gzip
○ Burrows-Wheeler transforms
■ Used by bzip2
○ LZMA
■ Used by 7-zip
○ brotli
■ Published by Google in 2015
○ Zstandard (zstd)
■ Published by Facebook in 2016
■ Designed to provide DEFLATE-like compression ratios with
faster expansion runtimes
■ At its maximum compression level gives a compression ratio
close to lzma, better than bzip2
■ Decompresses faster than the other algorithms listed here while achieving a similar or better compression ratio
42
So DEFLATE et al. achieve even better general-purpose compression?

● How much can they compress a file?


● Better question:
○ How much can a file be compressed by any algorithm?
● No algorithm can compress every bitstream
○ Assume we have such an algorithm
○ We could use it to compress its own output!
○ And we could keep compressing its output until our
compressed file is 0 bits!
■ Clearly this can’t work
● Proofs in Proposition S of Section 5.5 of the text

43
Can we reason about how much a file can be compressed?

● Yes! Using Shannon Entropy

44
Information theory in a single slide...

● Founded by Claude Shannon in his paper "A Mathematical Theory of Communication"
● Entropy is a key measure in information theory
○ Slightly different from thermodynamic entropy
○ A measure of the unpredictability of information content
○ By losslessly compressing data, we represent the same
information in less space
○ Hence, 8 bits of uncompressed text has less entropy than 8
bits of compressed data
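For reference, the entropy of a source that emits symbol i with probability p_i is defined as

H = − Σ_i p_i · lg(p_i)   bits per symbol

e.g., a fair coin flip has H = 1 bit, while a coin that always lands heads has H = 0.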

45
Entropy applied to language:

● Translating a language into binary, the entropy is the average number of bits required to store a letter of the language
● Entropy of a message * length of message = amount of
information contained in that message
● On average, a lossless compression scheme cannot
compress a message to have more than 1 bit of information
per bit of compressed message
● Uncompressed, English has between 0.6 and 1.3 bits of
entropy per character of the message
46
