A Post-Training Enhanced Optimization Approach for Small Language Models

Zhai, Keke

arXiv:2411.02939 (cs)

[Submitted on 5 Nov 2024]

Abstract: This paper examines continuous post-training optimization for small language models and proposes a method for constructing the alignment data used in continuous post-training. The method relies on guidance from large models to improve the diversity and accuracy of the alignment data. To verify its effectiveness, we use Qwen2-0.5B-Instruct as the baseline small language model and, with the alignment dataset built by the proposed method, train and compare several groups of experiments: an SFT (Supervised Fine-Tuning) post-training experiment, a KTO (Kahneman-Tversky Optimization) post-training experiment, an SFT-KTO two-stage post-training experiment, and a model weight fusion experiment. Finally, we evaluate and analyze the post-trained models and confirm that the proposed continuous post-training optimization method can significantly improve the performance of small language models.
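The abstract does not say how the model weight fusion experiment combines checkpoints. As a rough illustration only, the sketch below assumes the simplest common form, a linear interpolation between two checkpoints' parameters. The function name `fuse_weights`, the mixing coefficient `alpha`, and the plain-dictionary checkpoint format are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of floats.
StateDict = Dict[str, List[float]]


def fuse_weights(sft: StateDict, kto: StateDict, alpha: float = 0.5) -> StateDict:
    """Linearly interpolate two checkpoints: alpha * sft + (1 - alpha) * kto.

    This is an assumed fusion rule for illustration, not the paper's method.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if sft.keys() != kto.keys():
        raise ValueError("checkpoints must have the same parameter names")

    fused: StateDict = {}
    for name in sft:
        a, b = sft[name], kto[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted average of the two parameter vectors.
        fused[name] = [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]
    return fused


if __name__ == "__main__":
    sft_ckpt = {"layer.weight": [1.0, 3.0]}
    kto_ckpt = {"layer.weight": [3.0, 5.0]}
    print(fuse_weights(sft_ckpt, kto_ckpt, alpha=0.5))
```

With real models the same interpolation would be applied tensor by tensor to the two checkpoints' parameters; which checkpoints are fused and with what coefficient is something only the full paper can settle.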

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2411.02939 [cs.CL]
	(or arXiv:2411.02939v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2411.02939

Submission history

From: Keke Zhai
[v1] Tue, 5 Nov 2024 09:32:26 UTC (429 KB)
