FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression

Fazal Mittu¹, Yihuan Bu¹, Akshat Gupta¹, Ashok Devireddy¹, Alp Eren Ozdarendeli¹, Anant Singh², Gopala Anumanchipalli¹
¹UC Berkeley, ²NYU
[email protected]

arXiv:2409.17141v1 [cs.CL] 25 Sep 2024

Abstract

While the language modeling objective has been shown to be deeply connected with compression, it is surprising that modern LLMs are not employed in practical text compression systems. In this paper, we provide an in-depth analysis of neural network and transformer-based compression techniques to answer this question. We compare traditional text compression systems with neural network and LLM-based text compression methods. Although LLM-based systems significantly outperform conventional compression methods, they are highly impractical. Specifically, LLMZip, a recent text compression system using Llama3-8B, requires 9.5 days to compress just 10 MB of text, although with huge improvements in compression ratios. To overcome this, we present FineZip - a novel LLM-based text compression system that combines ideas of online memorization and dynamic context to reduce the compression time immensely. FineZip can compress the above corpus in approximately 4 hours compared to 9.5 days, a 54 times improvement over LLMZip with comparable performance. FineZip outperforms traditional algorithmic compression methods by a large margin, improving compression ratios by approximately 50%. With this work, we take the first step towards making lossless text compression with LLMs a reality. While FineZip presents a significant step in that direction, LLMs are still not a viable solution for large-scale text compression. We hope our work paves the way for future research and innovation to solve this problem.

1 Introduction

While the relationship between language modeling and compression has long been known (Schmidhuber and Heil, 1996; Mahoney, 2000; Goyal et al., 2018; Bellard, 2019), recent works (Delétang et al., 2024; Huang et al., 2024) have reinforced this connection. Delétang et al. (2024) recently showed that large language models (LLMs) can be used to compress data from various modalities. Huang et al. (2024) followed up this work by showing that the increasing compression abilities of LLMs are linearly correlated with downstream task performance.

Previous works have exploited this connection for lossless text compression. Neural network based models have been implemented for text compression (Schmidhuber and Heil, 1996; Mahoney, 2000; Goyal et al., 2018) and have reached better compression performance than traditional algorithmic compressors such as gzip. More recent methods have explored using LSTM and transformer models (Bellard, 2019, 2021). These methods fall under the "online" compressor category, where a randomly initialized model is directly trained on the data being compressed. In this case, the model parameters also become part of the compression. A recent effort, LLMZip (Valmeekam et al., 2023), tested the use of LLMs for lossless compression. Given an LLM's ability to predict the next token provided a fixed-length context window, a tokenized text can be stored as probabilistic ranks produced by an LLM predicting the next token. This is a type of "offline" compression, with a fixed system used for both compression and decompression of all incoming text.

In this paper, we build on prior work and introduce FineZip, which uses LLMs for lossless text compression with both online and offline components. FineZip combines an "online" component, which memorizes the data being compressed, with an "offline" component in the form of pre-trained LLMs for compression. The "online" memorization is done by fine-tuning the model on the data being compressed in a parameter-efficient way (Hu et al., 2021; Dettmers et al., 2023), with an additional constant overhead of the learned embeddings during fine-tuning. The "offline" component of the system is the pre-trained LLM, which remains fixed across different corpora. Figure 1 depicts the system diagram for FineZip.
Figure 1: System diagram for FineZip.

With this approach, we can leverage the benefits of online compression for improved performance without the drawback of requiring additional storage for model parameters.

Additionally, with FineZip we allow for a dynamic context where each token being compressed has a context size equal to its position in a sentence. This allows us to batch compression and decompression steps using LLMs, allowing for significant speed-up. "Online memorization" using PEFT methods also allows the model to compensate for the loss of performance due to a dynamic context, while a dynamic context allows for batching, which allows compression and decompression of many batches of text in parallel within a fixed compute budget. With FineZip, we achieve 54 times faster compression than LLMZip with a minor loss of performance, still outperforming traditional text compression methods by a huge margin. Our work also shows that compression rates of LLM-based methods are still not low enough for practical use cases, and although FineZip pushes the limits of using LLMs for lossless text compression in practice, much work still needs to be done. The code for our work can be found here - https://github.com/fazalmittu/FineZip.

2 Introducing FineZip

The most basic form of compression using LLMs would be to tokenize the input text. Since each character in a word occupies 8 bits (1 byte in UTF-8 encoding), representing the word as a token, essentially converting it into a number, will almost always reduce the number of bytes needed to represent it. This connection was also observed in Delétang et al. (2024). As a next step, we can use the predictive capabilities of LLMs for compression. This idea is used in LLMZip (Valmeekam et al., 2023), where they use a pre-trained LLM for text compression. The connection between language modeling and compression becomes intuitive when we take a deeper look at the language modeling objective, implemented using a cross-entropy loss. It aims to make each token in the training data the most probable token given the context preceding it, thus minimizing the number of bits required to represent the rank of the token in the vocabulary list, when ranked in descending order according to probability. Following this line of thought, we propose an intuitive yet effective way of enhancing this - fine-tuning the model on the data being compressed.

A challenge towards fine-tuning modern LLMs is that they are memory-intensive. Additionally, if we fine-tune the entire model on the text being compressed, then the entire LLM becomes part of the compression, requiring additional space equal to the space required to store the model for decompression. Thus, we propose FineZip, a compression framework that involves parameter-efficient fine-tuning (PEFT) (Mangrulkar et al., 2022) on the input text as an "online" step prior to compression.
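To make the rank-based coding described above concrete, the sketch below encodes a tokenized text into next-token ranks with a causal language model and decodes it back. This is a minimal illustration, not the released FineZip or LLMZip implementation; the small placeholder model and the argsort-based ordering are assumptions for the example.

```python
# Minimal sketch of rank-based coding with a causal LM (illustrative only; not
# the released FineZip/LLMZip code). "gpt2" is a small placeholder model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def text_to_ranks(text: str):
    ids = tok(text, return_tensors="pt").input_ids[0]
    with torch.no_grad():
        logits = model(ids.unsqueeze(0)).logits[0]            # (seq_len, vocab)
    ranks = []
    for pos in range(len(ids) - 1):
        order = torch.argsort(logits[pos], descending=True)   # most probable token first
        ranks.append(int((order == ids[pos + 1]).nonzero()))  # rank of the true next token
    return int(ids[0]), ranks                                 # first token is stored verbatim

def ranks_to_ids(first_id: int, ranks):
    ids = [first_id]
    for r in ranks:                                           # O(n^2) re-encoding; a sketch, not optimized
        with torch.no_grad():
            logits = model(torch.tensor([ids])).logits[0, -1]
        ids.append(int(torch.argsort(logits, descending=True)[r]))
    return ids

first, ranks = text_to_ranks("Language modeling is compression.")
print(tok.decode(ranks_to_ids(first, ranks)))                 # should reproduce the input text
```

Note that, in practice, the encoder and decoder must compute logits in exactly the same way (same precision and batching), so that each rank maps back to the same token.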
We call this fine-tuning step the "online memorization" step, which makes the data being compressed more probable for the LLM. This fine-tuning is implemented using LoRA (Hu et al., 2021) and is much faster than full fine-tuning, requires much less GPU memory, and requires only a very small amount of additional storage for the trained embeddings. The additional embedding storage does not scale with the dataset being compressed and becomes negligible at large sizes of corpora.

Method         Compression Ratio   Time (min)
zlib           0.3251              0.0083
gzip           0.3238              0.0141
bzip2          0.2374              0.0437
NNCP           0.15021             251
LLMZip (AC)    0.0795              13571
LLMZip         0.1163              13651
FineZip (AC)   0.0797              13118
FineZip        0.12799             250
FineZip-4bit   0.1445              67

Table 1: Comparison of compression methods on 10 MB.
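As a rough illustration of this memorization step, a short LoRA fine-tune with the Hugging Face PEFT library could look like the following. The file name, LoRA rank, chunking, and training hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch of "online memorization": a brief LoRA fine-tune of the
# compressor LLM on the file being compressed. Hyperparameters are placeholders.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "meta-llama/Meta-Llama-3-8B"   # the paper's model; any causal LM works here
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token
model = get_peft_model(AutoModelForCausalLM.from_pretrained(model_name),
                       LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

# Split the file being compressed into fixed-size training chunks.
text = open("enwik8_10mb.txt").read()
chunks = tok(text, truncation=True, max_length=512, return_overflowing_tokens=True)
ds = Dataset.from_dict({"input_ids": chunks["input_ids"]})

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="memorization", num_train_epochs=1,
                           per_device_train_batch_size=4, learning_rate=2e-4),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
model.save_pretrained("memorization/lora_adapter")  # small adapter = constant storage overhead
```

Only the small LoRA adapter needs to accompany the compressed file, which is the constant storage overhead mentioned above.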
Another key difference between LLMZip and FineZip is that FineZip adopts a dynamic context size approach rather than maintaining a fixed sliding window. LLMZip uses a permanent sliding window approach, where the rank of each token produced has a fixed context window of a preset context size (512, as chosen by the original authors). This by design makes the compression process extremely autoregressive and non-parallelizable, as to produce the rank of a token, you need the previous 512 tokens.

FineZip overcomes this limitation by employing a two-step dynamic context window technique:

1. Divide the corpus into chunks of a pre-decided window length.

2. Produce the ranks of each token within the window such that the rank for the ith token is produced based on the tokens preceding it.

The dynamic context window gives a variable context size to each token in a chunk. For a uniform comparison, we use a chunking size of 512 in FineZip, which is the same as the context window size chosen by LLMZip. In FineZip, the ith token in a chunk has a context size of i − 1, thus only the final token in a chunk has access to the full context length of 512. In contrast, every token in LLMZip has access to the full context length of 512. The dynamic context leads to some loss of performance, which is made up for by online memorization.
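A minimal sketch of this chunked, dynamic-context rank computation is shown below: windows of 512 tokens are stacked into a batch, and a single forward pass yields the rank of every token given only the tokens before it in its window. The window and batch sizes, EOS padding, and placeholder model are illustrative assumptions, not the authors' implementation.

```python
# Sketch of dynamic-context ranking: each window is ranked independently, so many
# windows can be processed in one batched forward pass. Sizes are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

WINDOW, BATCH = 512, 8
tok = AutoTokenizer.from_pretrained("gpt2")            # placeholder for Llama-3 8B
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def ranks_by_window(ids: torch.Tensor):
    pad = (-len(ids)) % WINDOW                         # pad the tail so windows are full
    ids = torch.cat([ids, ids.new_full((pad,), tok.eos_token_id)])
    windows = ids.view(-1, WINDOW)
    out = []
    for start in range(0, len(windows), BATCH):
        batch = windows[start:start + BATCH]
        with torch.no_grad():
            logits = model(batch).logits               # (B, WINDOW, vocab)
        true_next = batch[:, 1:].unsqueeze(-1)         # token i+1 is predicted from tokens <= i
        # rank = number of vocabulary entries scoring strictly higher than the true token
        ranks = (logits[:, :-1] > logits[:, :-1].gather(-1, true_next)).sum(-1)
        out.extend(ranks.tolist())
    # Per-window rank lists; the first token of each window is stored verbatim, and
    # ranks falling in the padded tail are discarded using the stored original length.
    return out

ids = tok(open("enwik8_10mb.txt").read(), return_tensors="pt").input_ids[0]
rank_stream = ranks_by_window(ids)
```

Decompression mirrors this: within a window, tokens are regenerated left to right from the stored ranks, while different windows can again be processed in parallel batches.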
3 Experiments

We begin by comparing FineZip with (i) traditional text compression methods - bzip2 (Julian Seward, 2024), zlib (Jean-loup Gailly, 2024), and gzip (Jean-loup Gailly, 1992), (ii) neural network based text compression methods - NNCP (Bellard, 2021), and (iii) the recent LLM-based text compression method called LLMZip. For both FineZip and LLMZip, we use Llama-3 8B (Dubey et al., 2024).

Modifications to LLMZip: LLMZip originally used Llama-1-7B (Touvron et al., 2023a), while we leverage Llama-3-8B for both LLMZip and FineZip for a uniform comparison. Additionally, LLMZip used two methods for compression - one using arithmetic coding (AC) and the other using a secondary compression method on the generated ranks. LLMZip uses zlib (Jean-loup Gailly, 2024) as a secondary compression method over ranks, whereas our experiments show that bzip2 provides a much better compression ratio (Appendix A.1). Thus, we use bzip2 as our secondary compression method for LLM ranks in both LLMZip and FineZip. We also refer to bzip2 as the baseline for text compression using traditional compression methods (Table 1). To offer a better comparison, we also create a version of FineZip that incorporates arithmetic coding. The process uses the logits that the LLM outputs for each new token as the probability distribution update for the arithmetic coding scheme.

We used the first 10 MB of the enwik8 (Marcus Hutter, 2006) dataset, which is a standard benchmark for compression tasks. Though compression ratio (the ratio of compressed file size to original file size) is the key metric, we are also interested in measuring the time taken by these compression systems to evaluate practicality. The results are shown in Table 1. The first key observation is that neural network and LLM based compression methods have significantly better compression ratios than traditional text compression methods (zlib, gzip, bzip2), thus highlighting the potential impact of these methods for text compression. The second key observation is that neural network and LLM based methods take a long time to compress even small amounts of text, thus preventing their use in practice. This is especially true when using AC for compression in LLM-based methods, which produces exceptional compression ratios but also requires unprecedentedly large amounts of time.
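As a rough sketch of the secondary compression stage described above, the rank stream can be serialized and passed through a standard compressor; the fixed-width serialization below is an assumption for illustration, not necessarily how the released code packs ranks.

```python
# Secondary compression of the LLM rank stream with bzip2, plus the compression
# ratio (compressed size / original size). The uint32 packing is an assumption;
# most ranks are tiny, and bzip2 recovers much of the redundancy this leaves.
import bz2
import numpy as np

def compress_ranks(ranks, original: bytes):
    payload = np.asarray(ranks, dtype=np.uint32).tobytes()
    blob = bz2.compress(payload, compresslevel=9)
    return blob, len(blob) / len(original)

original = open("enwik8_10mb.txt", "rb").read()
ranks = [0, 0, 3, 1, 0, 12]        # placeholder; real ranks come from the LLM ranking step
blob, ratio = compress_ranks(ranks, original)
print(f"compression ratio: {ratio:.4f}")
```

A tighter variable-length encoding of the ranks would reduce the payload further; the sketch only illustrates the two-stage pipeline.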
For LLMZip with AC, the time taken to compress 10 MB of data is approximately 9.5 days. Thus, we do not explore AC-based LLM compression further and strictly compare only rank-based LLM baselines.

Table 1 shows that FineZip is able to achieve comparable or better compression ratios than both NNCP and LLMZip with a much faster compression time. Specifically, we see that FineZip has a much better compression ratio than NNCP with a comparable amount of compression time, while the 4-bit quantized FineZip is approximately 4 times faster than NNCP and still exhibits a better compression ratio. FineZip compresses enwik8 within 4 hours, compared to approximately 227 hours taken by LLMZip. This is a 54x improvement in compression time with a minor drop of 1 percentage point in compression ratio.

3.1 FineZip Ablations

Figure 2: FineZip ablations for different fine-tuning epochs.

FineZip uses an "online memorization" step as shown in Figure 1 before performing compression. This is done using Low-Rank Adaptation (LoRA) (Hu et al., 2021). We compare the effect of fine-tuning on compression using 3 different language models: GPT2-XL 1.3B (Radford et al., 2019), Llama-2 7B (Touvron et al., 2023b), and Llama-3 8B (Dubey et al., 2024). We see that for each model, memorization improves the absolute compression ratio by at least 1 percentage point, a relative improvement of about 8% over its non-fine-tuned baseline, as shown in Figure 2. This is especially significant when dealing with such low compression rates. It should be noted that the time taken for memorization is negligible compared to compression time and can be ignored.

Figure 3: Compressing the 10 MB dataset with Llama-3 8B loaded with 4, 8, 16, and 32-bit precision. The purple bar shows compression ratio; the red line shows the time taken to compress. Each batch size was chosen to max out memory on a 48GB GPU.

Quantization: We saw in Table 1 that dynamic context helps speed up the compression process by significant amounts, while online memorization is able to mitigate the loss in performance. We further push the limits of compression time using quantization. We perform the memorization step using QLoRA (Dettmers et al., 2023) and perform compression using the quantized model. We do this using a fixed compute budget of 48GB of GPU memory on a single NVIDIA A6000 GPU. Lower precision models allow us to increase the batch size and, in turn, decrease the time needed to compress a file by a sizable amount. Figure 3 shows that fine-tuning and compressing with a 4-bit model allows us to fit a batch size of 70 on one A6000 GPU and achieve a compression time of 67 minutes. This 4x speed-up makes FineZip not only a competitive compressor, outperforming traditional text compression systems by a huge margin, but also the fastest neural network/transformer based compressor currently available.
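As an illustration of the quantized setup, loading the base model in 4-bit with bitsandbytes for QLoRA-style memorization and batched ranking might look like the sketch below; the quantization settings are common defaults, not necessarily the exact configuration used in the paper.

```python
# Sketch of the 4-bit variant: load the base model quantized (QLoRA-style) so a
# larger compression batch fits in a fixed GPU memory budget.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    quantization_config=bnb_config,
    device_map="auto",
)
# The quantized model is then used both for the QLoRA memorization step and for
# batched rank computation; lower precision frees memory for a larger batch
# (e.g., batch size 70 on a 48GB A6000 in the paper's setup).
```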
4 Conclusion

In this paper we explore the possibility of using LLMs for lossless text compression. We show that while neural network and LLM based text compression systems lead to significantly better compression rates, they also require impractical amounts of compression time. To alleviate this, we introduce FineZip - an LLM-based lossless text compression system which compresses text 54 times faster than LLMZip with a minor loss in compression performance. FineZip also improves on the compression ratio of traditional text compression systems by approximately 50%. We also show that while FineZip presents a significant step towards making practical text compression systems using LLMs, much still needs to be done. We hope our work can serve as a stepping stone in that direction.
5 Limitations

LLM-based text compression systems assume a GPU is available on the host machine for local compression. While this is not true for every personal computer, the landscape is rapidly changing. Many personal laptops are now equipped with GPUs, and as compute becomes cheaper and the power of LLMs grows, we envision a future where every personal computer will be equipped with an LLM running locally and performing various tasks.

References

Fabrice Bellard. 2019. Lossless data compression with neural networks.

Fabrice Bellard. 2021. NNCP v2: Lossless data compression with transformer.

J. Cleary and I. Witten. 1984. Data compression using adaptive coding and partial string matching. IEEE Transactions on Communications, 32(4):396–402.

Grégoire Delétang, Anian Ruoss, Paul-Ambroise Duquenne, Elliot Catt, Tim Genewein, Christopher Mattern, Jordi Grau-Moya, Li Kevin Wenliang, Matthew Aitchison, Laurent Orseau, Marcus Hutter, and Joel Veness. 2024. Language modeling is compression. Preprint, arXiv:2309.10668.

Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, and Luke Zettlemoyer. 2023. QLoRA: Efficient finetuning of quantized LLMs. Preprint, arXiv:2305.14314.

Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, et al. 2024. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783.

Google. 2024. Brotli compression algorithm. Accessed: 2024-06-01.

Mohit Goyal, Kedar Tatwawadi, Shubham Chandak, and Idoia Ochoa. 2018. DeepZip: Lossless data compression using recurrent neural networks. Preprint, arXiv:1811.08162.

Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2021. LoRA: Low-rank adaptation of large language models. Preprint, arXiv:2106.09685.

Yuzhen Huang, Jinghan Zhang, Zifei Shan, and Junxian He. 2024. Compression represents intelligence linearly. Preprint, arXiv:2404.09937.

Jean-loup Gailly. 1992. Gzip. http://www.gzip.org. Accessed: 2024-08-15.

Jean-loup Gailly. 2024. Zlib: A massively spiffy yet delicately unobtrusive compression library. http://www.zlib.net. Accessed: 2024-08-15.

Julian Seward. 2024. bzip2 - a free and open-source file compression program. Accessed: 2024-06-01.

Matthew V. Mahoney. 2000. Fast text compression with neural networks. In Proceedings of the Thirteenth International Florida Artificial Intelligence Research Society Conference, pages 230–234. AAAI Press.

Sourab Mangrulkar, Sylvain Gugger, Lysandre Debut, Younes Belkada, Sayak Paul, and Benjamin Bossan. 2022. PEFT: State-of-the-art parameter-efficient fine-tuning methods. https://github.com/huggingface/peft.

Marcus Hutter. 2006. enwik8. http://prize.hutter1.net/index.htm. Accessed: 2024-08-15.

Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, et al. 2019. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9.

J. Schmidhuber and S. Heil. 1996. Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1):142–146.

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. 2023a. LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971.

Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. 2023b. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.

Chandra Shekhara Kaushik Valmeekam, Krishna Narayanan, Dileep Kalathil, Jean-Francois Chamberland, and Srinivas Shakkottai. 2023. LLMZip: Lossless text compression using large language models. Preprint, arXiv:2306.04050.
A Appendix

A.1 Evaluating Traditional Compression Methods

We first experimented with three traditional compression methods - Brotli (Google, 2024), BZ2 (Julian Seward, 2024), and PPM (Cleary and Witten, 1984) - for text compression as a function of increasing dataset size. We find that PPM performs best for text compression, and that its performance remains relatively constant with respect to dataset size. The results can be seen in Figure 4.

Figure 4: Evaluating Baseline Compression Techniques Brotli, BZ2, and PPM on enwik8.

We then use these algorithms to compress the ranks generated by LLMs in FineZip. We see that BZ2 has the best performance, so we chose it as the traditional compression method for FineZip.

Figure 5: Testing Traditional Compression Techniques Brotli, BZ2, and PPM on the ranks produced by compressing enwik8 with Llama-2 7B fine-tuned for 64 epochs with r=16.
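The baseline measurement is straightforward to reproduce in spirit; a minimal sketch is shown below. PPM is omitted because it has no standard Python binding, and zlib is included instead as an extra reference point, so this is an approximation of the benchmark rather than the authors' script.

```python
# Compression ratio of standard compressors on enwik8 prefixes of increasing
# size (an approximation of the A.1 benchmark; PPM is omitted).
import bz2
import zlib
import brotli  # pip install brotli

data = open("enwik8", "rb").read()
for size_mb in (1, 10, 100):
    chunk = data[: size_mb * 10**6]
    for name, compress in (("zlib", zlib.compress),
                           ("bz2", bz2.compress),
                           ("brotli", brotli.compress)):
        ratio = len(compress(chunk)) / len(chunk)
        print(f"{name} on {size_mb} MB: {ratio:.4f}")
```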
A.2 Double Compression Benchmark

Prior to testing FineZip, we compressed the enwik8 (Marcus Hutter, 2006) dataset using traditional compression techniques (Brotli, BZ2, PPM) to create a benchmark for ourselves. Figure 4 shows that Brotli, BZ2, and PPM perform consistently across varying input file sizes and that PPM performs the best on textual data, reaching a compression ratio of approximately 0.25. Figure 6 measures the compression ratio when two compression techniques are stacked and serves as a more accurate benchmark for FineZip, as it also employs two-step compression. Through this set of baseline experiments, we can see that a compression ratio of 0.25 is the value to beat.

Figure 6: Evaluating Stacked Compression with Brotli, BZ2, and PPM on enwik8.

A.3 Context Size

To determine the best context window size to use, we ran experiments with the Llama-2 7B base model (LLMZip) and discovered that a larger context size results in a better compression ratio. The compression ratio began to plateau as the context window reached 512, so we decided to use that for all of our experimentation.

Figure 7: Evaluating the Best Context Window for Compression.
A.4 FineZip and Dataset Size

The previous experiments used a dataset size of only 10 MB; for FineZip to be a viable compression technique, it has to scale well to both smaller and larger file sizes. Figure 8 shows that Llama-3 8B (Dubey et al., 2024) fine-tuned for 256 epochs maintains a consistent compression ratio on dataset sizes of 1, 10, and 100 MB. This verifies that FineZip remains viable regardless of dataset size.

Figure 8: Compressing input files of size 1, 10, and 100 megabytes with Llama-3 8B fine-tuned for 256 epochs.
