static int compare_win(const unsigned char *window, const unsigned char *buffer, int *offset,
unsigned char *next)
(window points to the sliding window; buffer points to the look-ahead buffer; offset points to the
location that receives the position in the sliding window where the match begins; next points to the
location that receives the symbol in the look-ahead buffer immediately after the match. The function
returns the length, in bytes, of the longest match found in the sliding window.)
1. Initialize the offset; its value is meaningful only once a match is found.
*offset=0;
2. If no match is found, prepare to return 0 and the first symbol in the look-ahead buffer.
longest=0;
*next=buffer[0];
3. Look for the best match in the look-ahead buffer and sliding window.
a. k=0;
b. While (k is less than the size of the sliding window, so that k visits every position from the
first byte of the window to the last)
c. /* start of the outer while loop */
{
i=k;
j=0;
match=0;
d. Determine how many symbols match in the sliding window at offset k.
e. While (i is less than the size of the sliding window and j is less than the size of the look-ahead
buffer minus 1, so that buffer[j] in step 4 stays inside the buffer)
{
if (the byte at position i of the sliding window does not match the byte at position j of the
look-ahead buffer)
break from the loop;
else
match++;
i++;
j++;
}
4. if(match>longest)
{
*offset=k;
longest=match;
*next=buffer[j];
}
k++;
5. } /* end of the outer while loop */
6. Return longest.
7. End of the function.
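The steps above can be sketched in C as follows. This is a minimal sketch rather than a definitive implementation: the constant names LZ77_WINDOW_SIZE and LZ77_BUFFER_SIZE are assumptions (only the sizes 4096 and 32 come from the text), and the inner loop stops one symbol short of the end of the look-ahead buffer so that the symbol stored in next always lies inside the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed names; the sizes (4K window, 32-byte look-ahead) come from the text. */
#define LZ77_WINDOW_SIZE 4096
#define LZ77_BUFFER_SIZE 32

/* Returns the length of the longest match between the start of buffer and
 * any position in window. On a match, *offset is the window position where
 * the match begins and *next is the look-ahead symbol right after it. */
static int compare_win(const unsigned char *window, const unsigned char *buffer,
                       int *offset, unsigned char *next)
{
    int match, longest, i, j, k;

    assert(window != NULL && buffer != NULL && offset != NULL && next != NULL);

    /* Steps 1 and 2: defaults for the "no match" case. */
    *offset = 0;
    longest = 0;
    *next = buffer[0];

    /* Step 3: try every starting position k in the sliding window. */
    for (k = 0; k < LZ77_WINDOW_SIZE; k++) {
        i = k;
        j = 0;
        match = 0;

        /* Count matching symbols; j stops at LZ77_BUFFER_SIZE - 1 so that
         * buffer[j] below is always a valid element. */
        while (i < LZ77_WINDOW_SIZE && j < LZ77_BUFFER_SIZE - 1) {
            if (window[i] != buffer[j])
                break;
            match++;
            i++;
            j++;
        }

        /* Step 4: remember the best match seen so far. */
        if (match > longest) {
            *offset = k;
            longest = match;
            *next = buffer[j];
        }
    }

    /* Step 6. */
    return longest;
}
```

Because only a strictly longer match replaces the current best, ties are resolved in favour of the smallest window position.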
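The lz77_Compress operation uses LZ77 to compress a buffer of data
specified by the caller. Because the amount of storage required for
the compressed data is unknown to the caller, lz77_Compress
dynamically allocates storage as needed.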
It begins by writing the number of symbols in the data to the buffer
of compressed data and initializing the sliding window and look ahead
buffer. The look ahead buffer is then loaded with symbols.
Compression takes place inside of a loop that iterates until there are
no more symbols to process. We use ipos to keep track of the current
byte being processed in the original data, and opos to keep track of
the current bit we are writing to the buffer of compressed data.
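Since opos counts bits rather than bytes, tokens do not have to start on byte boundaries. A minimal sketch of such a bit-level write is given here; put_bits is a hypothetical helper name that does not come from the implementation described, and filling each byte from its most significant bit downward is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: append the low nbits of value to out, most
 * significant bit first, starting at bit position *opos. The position
 * is advanced by nbits. out must be large enough for the bits written. */
static void put_bits(unsigned char *out, int *opos, uint32_t value, int nbits)
{
    int i;

    assert(out != NULL && opos != NULL && nbits >= 0 && nbits <= 32);

    for (i = nbits - 1; i >= 0; i--) {
        /* Bit 0 of the stream is the top bit of out[0] (assumed ordering). */
        unsigned char mask = (unsigned char)(0x80u >> (*opos % 8));

        if ((value >> i) & 1u)
            out[*opos / 8] |= mask;
        else
            out[*opos / 8] &= (unsigned char)~mask;

        (*opos)++;
    }
}
```

For example, writing a single 1 bit followed by the 8-bit value 0x41 into a zeroed buffer leaves opos at 9 and occupies the first byte and the top bit of the second.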
During each iteration of the loop, we call compare_win to determine
the longest phrase in the look-ahead buffer that matches one in the
sliding window. The compare_win function returns the length of the
longest match. When a match is found, compare_win sets offset to the
position of the match in the sliding window and next to the symbol in
the look ahead buffer immediately after the match. In this case, we
write a phrase token to the compressed data.
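The loop described above can be sketched as follows. To keep the example short it stores tokens in a plain struct instead of packing them into bits. Lz77Token and lz77_tokenize are hypothetical names, and the step that shifts the window and look-ahead buffer by the match length plus one symbol is an assumption about how the loop advances, not something stated in the text. compare_win is repeated in compact form so the block compiles on its own.

```c
#include <assert.h>
#include <string.h>

#define LZ77_WINDOW_SIZE 4096   /* sliding window size from the text */
#define LZ77_BUFFER_SIZE 32     /* look-ahead buffer size from the text */

/* Hypothetical in-memory token, used here instead of packed bits. */
typedef struct {
    int is_phrase;       /* 1 = phrase token, 0 = symbol token */
    int offset;          /* match position in the window (phrase only) */
    int length;          /* match length (phrase only) */
    unsigned char next;  /* symbol after the match, or the unmatched symbol */
} Lz77Token;

/* Compact form of the longest-match search. */
static int compare_win(const unsigned char *window, const unsigned char *buffer,
                       int *offset, unsigned char *next)
{
    int longest = 0, k;

    *offset = 0;
    *next = buffer[0];
    for (k = 0; k < LZ77_WINDOW_SIZE; k++) {
        int i = k, j = 0;
        while (i < LZ77_WINDOW_SIZE && j < LZ77_BUFFER_SIZE - 1 &&
               window[i] == buffer[j]) {
            i++;
            j++;
        }
        if (j > longest) {          /* j is the number of matched symbols */
            longest = j;
            *offset = k;
            *next = buffer[j];
        }
    }
    return longest;
}

/* Tokenize size bytes of original; returns the number of tokens stored. */
static int lz77_tokenize(const unsigned char *original, int size,
                         Lz77Token *tokens, int max_tokens)
{
    unsigned char window[LZ77_WINDOW_SIZE], buffer[LZ77_BUFFER_SIZE], next;
    int ipos = 0, remaining = size, count = 0, offset, length, i;

    assert(original != NULL && tokens != NULL && size >= 0);
    memset(window, 0, sizeof window);
    memset(buffer, 0, sizeof buffer);

    /* Load the look-ahead buffer with the first symbols. */
    for (i = 0; i < LZ77_BUFFER_SIZE && ipos < size; i++)
        buffer[i] = original[ipos++];

    while (remaining > 0 && count < max_tokens) {
        length = compare_win(window, buffer, &offset, &next);
        tokens[count].is_phrase = (length != 0);
        tokens[count].offset = offset;
        tokens[count].length = length;
        tokens[count].next = next;
        count++;

        /* Assumed shift step: consume the match plus the next symbol. */
        length++;
        memmove(window, window + length, LZ77_WINDOW_SIZE - length);
        memcpy(window + LZ77_WINDOW_SIZE - length, buffer, length);
        memmove(buffer, buffer + length, LZ77_BUFFER_SIZE - length);
        for (i = LZ77_BUFFER_SIZE - length; i < LZ77_BUFFER_SIZE && ipos < size; i++)
            buffer[i] = original[ipos++];
        remaining -= length;
    }
    return count;
}
```

In this sketch the final token may cover one symbol more than the input actually contains, so a matching decompressor would need the symbol count written at the start of the compressed data to know where to stop.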
Phrase tokens in the implementation presented here require 12 bits for
offsets because the size of the sliding window is 4k (4096 bytes).
Phrase tokens require 5 bits for lengths because no match will exceed
the length of the look-ahead buffer, which is 32 bytes. If no match is
found, compare_win returns 0 and sets next to the unmatched symbol at
the start of the look-ahead buffer. In this case, we write a symbol
token to the compressed data. Before actually writing the token, we
call the network function htonl as a convenient way to ensure that the
token is in big-endian format.
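A minimal sketch of building the two kinds of token and putting them into big-endian byte order is given below. The macro and function names are hypothetical. The field widths for the type bit, offset and length follow the text. The 8-bit width of the next symbol is an assumption based on symbols being of type unsigned char, and right-aligning the token in a 32-bit word is also an assumption. htonl is the standard POSIX byte-order function, so the sketch assumes a POSIX system.

```c
#include <arpa/inet.h>   /* htonl; assumes a POSIX system */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Field widths: type, offset and length are from the text; the symbol
 * width is assumed to be 8 bits (one unsigned char). */
#define TYPE_BITS    1
#define WINOFF_BITS  12
#define BUFLEN_BITS  5
#define NEXT_BITS    8
#define PHRASE_BITS  (TYPE_BITS + WINOFF_BITS + BUFLEN_BITS + NEXT_BITS)  /* 26 */

/* Hypothetical helper: pack a phrase token into the low 26 bits of a word. */
static uint32_t make_phrase_token(unsigned offset, unsigned length,
                                  unsigned char next)
{
    uint32_t token;

    assert(offset < (1u << WINOFF_BITS) && length < (1u << BUFLEN_BITS));

    token  = (uint32_t)1 << (PHRASE_BITS - TYPE_BITS);          /* type bit = 1 */
    token |= (uint32_t)offset << (BUFLEN_BITS + NEXT_BITS);     /* 12-bit offset */
    token |= (uint32_t)length << NEXT_BITS;                     /* 5-bit length */
    token |= next;                                              /* next symbol */
    return token;
}

/* Hypothetical helper: a symbol token is a 0 type bit plus the symbol. */
static uint32_t make_symbol_token(unsigned char symbol)
{
    return (uint32_t)symbol;
}

/* Convert to big-endian so that out[0] holds the most significant byte
 * regardless of the byte order of the host. */
static void token_to_bytes(uint32_t token, unsigned char out[4])
{
    uint32_t be = htonl(token);
    memcpy(out, &be, sizeof be);
}
```

With these widths, a phrase token for offset 100, length 3 and next symbol 'x' has the value 0x020C8378, and token_to_bytes stores it as the bytes 02 0C 83 78 on any host.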
Structure of a phrase token: type bit (1), offset (12 bits), length (5 bits), next symbol (8 bits); 26 bits in total.
Structure of a symbol token: type bit (0), followed by the unmatched symbol.