Bda Avro

The document outlines the data compression algorithms supported by the Apache Hadoop framework: DEFLATE, GZIP, BZIP2, LZO, LZ4, and Snappy. Each algorithm is described in terms of its compression ratio, speed, and suitability for different use cases. A summary table then lists, for each compression format, the tool, algorithm, filename extension, whether it can hold multiple files, and whether it is splittable.

Hadoop I/O

Data Compression Supported by Apache Hadoop Framework

• DEFLATE

• GZIP (GNU Zip)

• BZIP2

• LZO (Lempel-Ziv-Oberhumer)

• LZ4

• Snappy

• DEFLATE: A standard compression algorithm offering a good balance between compression ratio and speed.

• GZIP (GNU Zip): Uses DEFLATE with the .gz file format, providing good compression at the cost of speed.

• BZIP2: Offers a high compression ratio with slower performance, suitable for archival purposes.

• LZO (Lempel-Ziv-Oberhumer): Focuses on fast compression and decompression with a lower compression ratio.

• LZ4: Prioritizes very fast compression and decompression with a modest compression ratio.

• Snappy: Designed for high-speed compression and decompression with reasonable compression ratios.
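
The ratio-versus-speed trade-off described above can be observed with a small experiment. The following is a minimal sketch, not Hadoop code: it uses only the codecs that ship with the Python standard library (zlib for DEFLATE, gzip, and bz2), so LZO, LZ4, and Snappy are not covered. The function name `compare_codecs` and the sample data are hypothetical, and the timings will vary by machine and input.

```python
import bz2
import gzip
import time
import zlib


def compare_codecs(data: bytes) -> dict:
    """Return {codec_name: (compressed_size_bytes, elapsed_seconds)}."""
    codecs = {
        # DEFLATE stream inside a zlib wrapper, level 9
        "deflate": lambda d: zlib.compress(d, 9),
        # DEFLATE stream inside a gzip header and trailer, level 9
        "gzip": lambda d: gzip.compress(d, compresslevel=9),
        # bzip2, level 9 (the library default)
        "bzip2": lambda d: bz2.compress(d, 9),
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        size = len(compress(data))
        results[name] = (size, time.perf_counter() - start)
    return results


if __name__ == "__main__":
    # Hypothetical, highly repetitive sample; real data compresses far less.
    sample = b"hadoop stores large files in HDFS blocks\n" * 20000
    for name, (size, seconds) in compare_codecs(sample).items():
        ratio = len(sample) / size
        print(f"{name:8s} ratio={ratio:8.1f} time={seconds * 1000:.2f} ms")
```

Running it on a representative slice of your own data is more informative than the synthetic sample, since both ratio and speed depend heavily on the input.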

COMPRESSION FORMAT   TOOL    ALGORITHM   FILENAME EXTENSION   MULTIPLE FILES   SPLITTABLE
DEFLATE              N/A     DEFLATE     .deflate             No               No
GZIP                 gzip    DEFLATE     .gz                  No               No
ZIP                  zip     DEFLATE     .zip                 Yes              Yes, at file boundaries
bzip2                bzip2   bzip2       .bz2                 No               Yes
LZO                  lzop    LZO         .lzo                 No               No
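
The table above can be read as a lookup from filename extension to format properties, loosely analogous to how Hadoop's CompressionCodecFactory selects a codec from a file's extension. The sketch below is illustrative Python, not the Hadoop API: the names `Codec`, `CODECS_BY_EXTENSION`, `codec_for`, and `can_split` are hypothetical, and treating an unknown extension as uncompressed (and therefore splittable) is an assumption made for the example.

```python
from pathlib import PurePath
from typing import NamedTuple, Optional


class Codec(NamedTuple):
    name: str
    multiple_files: bool
    splittable: bool


# Values copied from the table above.
CODECS_BY_EXTENSION = {
    ".deflate": Codec("DEFLATE", multiple_files=False, splittable=False),
    ".gz": Codec("GZIP", multiple_files=False, splittable=False),
    # ZIP is splittable only at file boundaries within the archive.
    ".zip": Codec("ZIP", multiple_files=True, splittable=True),
    ".bz2": Codec("bzip2", multiple_files=False, splittable=True),
    ".lzo": Codec("LZO", multiple_files=False, splittable=False),
}


def codec_for(path: str) -> Optional[Codec]:
    """Look up the codec by the final extension, ignoring case."""
    return CODECS_BY_EXTENSION.get(PurePath(path).suffix.lower())


def can_split(path: str) -> bool:
    """Decide whether an input file may be divided into several splits."""
    codec = codec_for(path)
    # Assumption: an unrecognized extension means the file is uncompressed.
    return True if codec is None else codec.splittable


if __name__ == "__main__":
    for path in ["logs/part-0000.gz", "logs/part-0000.bz2", "logs/part-0000.txt"]:
        print(path, "splittable" if can_split(path) else "not splittable")
```

A non-splittable file such as a .gz must be processed by a single task from start to finish, which is why the SPLITTABLE column matters when choosing a format for large inputs.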