
The Complete Guide for Public Models & Exploring Methods for Fine Tuning

Muhammad Umair Imran


[email protected]

National University of Computer and Emerging Sciences

January 8, 2025
Contents

Abstract
Open Source Models
Fine Tuning Methods
    Low-Rank Adaptation (LoRA)
    QLoRA: Quantized Low-Rank Adaptation
    Prefix Fine Tuning
    Adapters
    Prompt Fine Tuning
    P-Tuning
Abstract
This report aims to provide clear guidelines for the selection of publicly available open-source models. It also explores various methods for fine-tuning these models depending on specific problem cases, resource availability, and performance requirements.

Open Source Models


Publicly available open-source models can be utilized for a variety of purposes. Below
are some notable models along with their specific use cases and details:

• Llama 3.1 8B Instruct: A model specifically trained to follow instructions, such as in conversations.
  https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

• Mistral 7B Instruct: Trained for instruction-following tasks.
  https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

• bigscience/bloomz-560m: Best for language translation, with support for over 46 languages.
  https://huggingface.co/bigscience/bloomz-560m

• google/gemma-2-2b-it: A model well-suited for small-resource environments, such as running on a laptop or edge devices.
  https://huggingface.co/google/gemma-2-2b-it

• tiiuae/falcon-180B-chat: Optimized for large-scale conversational applications.
  https://huggingface.co/tiiuae/falcon-180B-chat

• Salesforce/xgen-7b-8k-inst: Ideal for large context windows, especially for business-specific needs.
  https://huggingface.co/Salesforce/xgen-7b-8k-inst

• 01-ai/Yi-1.5-34B-Chat: Enhances chat performance in both English and Chinese.
  https://huggingface.co/01-ai/Yi-1.5-34B-Chat

• openai-community/gpt2: A general-purpose model that requires fine-tuning with task-specific instructions for optimal results.
  https://huggingface.co/openai-community/gpt2
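
All of the checkpoints above can be loaded through the Hugging Face transformers library. A minimal sketch (assuming transformers is installed and any gated model's license terms have been accepted; gpt2 is used here only because it is the smallest):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Any Hub checkpoint ID from the list above can be substituted here.
    checkpoint = "openai-community/gpt2"

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)

    # Generate a short continuation as a smoke test.
    inputs = tokenizer("The capital of France is", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))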

Fine Tuning Methods

PEFT: Parameter Efficient Fine-Tuning

Parameter Efficient Fine-Tuning (PEFT) refers to methods that focus on efficiently adapting large pre-trained models for specific downstream tasks without requiring full model retraining. PEFT techniques aim to save computational resources by reducing the number of trainable parameters.

Low-Rank Adaptation (LoRA) of Language Models


Low-Rank Adaptation, or LoRA, involves freezing the pretrained model weights while introducing trainable rank decomposition matrices into each layer of the Transformer architecture. This approach significantly reduces the number of trainable parameters for downstream tasks, making it a parameter-efficient method for fine-tuning large language models.

Figure 1: Low-Rank Adaptation Architecture.

LoRA works by approximating the weight update with the product of two low-rank matrices, which is then added back onto the frozen original weights. This enables the model to retain its original capabilities while being fine-tuned for specific tasks. As a result, LoRA speeds up training while maintaining overall model performance [1].
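
A minimal sketch of applying LoRA with the Hugging Face peft library. The rank r, the scaling factor lora_alpha, and the target module name below are illustrative choices for GPT-2, not values prescribed by this report:

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, TaskType, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

    # The update to each frozen weight W is the low-rank product B @ A,
    # where A is (r x k) and B is (d x r) with r much smaller than d and k.
    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,                        # rank of the decomposition matrices
        lora_alpha=16,              # scaling applied to the low-rank update
        lora_dropout=0.05,
        target_modules=["c_attn"],  # GPT-2's fused attention projection
    )

    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # only the A and B matrices are trainable

Because only the A and B matrices receive gradients, the optimizer state is far smaller than in full fine-tuning, which is where most of the resource savings come from.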

QLoRA: Quantized Low-Rank Adaptation


In large language models (LLMs), parameter efficiency is crucial for making deployment feasible in resource-limited settings. A model such as the 16-bit Llama 65B, which requires about 780 GB of GPU memory to fine-tune, can be prohibitively expensive to use in real-world applications. One approach to resolve this is quantization of the model, which reduces the model's precision, typically from 16-bit to 4-bit or 8-bit.

Figure 2: QLoRA

Once the model is quantized, it is further optimized through Low-Rank Adaptation (LoRA). By reducing the precision and applying LoRA, the model can be fine-tuned for specific tasks, leading to significant savings in both time and cost while retaining much of the original model's performance. For more details, refer to the work on QLoRA by Dettmers et al. [2].
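
A minimal sketch combining 4-bit quantization (via bitsandbytes) with LoRA, roughly following the recipe in [2]. The model ID and hyperparameters are illustrative assumptions, not values from this report:

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

    # Load the base model in 4-bit NormalFloat (NF4) with double quantization.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-Instruct-v0.2",
        quantization_config=bnb_config,
        device_map="auto",
    )

    # Stabilize training on the quantized weights, then attach LoRA adapters.
    model = prepare_model_for_kbit_training(model)
    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=16,
        lora_alpha=32,
        target_modules=["q_proj", "v_proj"],  # attention projections in Mistral
    )
    model = get_peft_model(model, lora_config)

The frozen base weights stay in 4-bit precision; only the small LoRA matrices are trained in higher precision, which is what makes fine-tuning large models feasible on a single GPU.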

Prefix Fine Tuning


In this method, we do not change the weights of the model; instead, we add a trainable prefix during fine-tuning. For example, if we are handling queries such as "Who is the president of Pakistan?", we can add a prefix, such as [history-related], to save time and computational resources. After fine-tuning, the model has already been adapted, so the prefix does not need to be added manually during inference.
For more details on prefix tuning, see Li and Liang [3].

Figure 3: Prefix Fine Tuning
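
A minimal sketch of prefix tuning with the peft library, where a number of trainable prefix vectors are prepended to the attention keys and values at every layer. The checkpoint and prefix length are illustrative choices:

    from transformers import AutoModelForCausalLM
    from peft import PrefixTuningConfig, TaskType, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

    prefix_config = PrefixTuningConfig(
        task_type=TaskType.CAUSAL_LM,
        num_virtual_tokens=20,  # length of the trainable prefix
    )

    model = get_peft_model(model, prefix_config)
    model.print_trainable_parameters()  # only the prefix parameters are trainable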

Adapters
Instead of performing full fine-tuning, adapter layers are added between the model's forward layers. Full fine-tuning adjusts all parameters, which requires significant resources. In the adapter approach, task-specific layers are added to the model, making it more flexible for many use cases. While this method saves resources, the additional layers can make inference slightly slower due to sequential processing.
For further information on adapters, refer to Hu et al. [4].

Figure 4: Adapters
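
A minimal PyTorch sketch of a bottleneck adapter layer of the kind described above. The class name and bottleneck size are illustrative assumptions, not taken from the cited work:

    import torch
    import torch.nn as nn

    class BottleneckAdapter(nn.Module):
        """Down-project, apply a non-linearity, up-project, and add a
        residual connection so the frozen pretrained path is preserved."""

        def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
            super().__init__()
            self.down = nn.Linear(hidden_dim, bottleneck_dim)
            self.up = nn.Linear(bottleneck_dim, hidden_dim)
            self.act = nn.GELU()
            # Zero-initialize the up-projection so the adapter starts as an
            # identity function and training begins from the frozen model.
            nn.init.zeros_(self.up.weight)
            nn.init.zeros_(self.up.bias)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x + self.up(self.act(self.down(x)))

During fine-tuning, only these adapter parameters receive gradients; the extra sequential computation in each forward pass is what makes inference slightly slower, as noted above.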

Prompt Fine Tuning


In prompt fine-tuning, a model is trained with manually written (hard) prompts or learnable soft prompts during the fine-tuning phase. This approach is computationally low-cost and can also be tailored to specific tasks.
For example, in a sentiment analysis task, a soft prompt setup might look like the following:

- Setup: the input becomes [soft prompt embeddings] + "This movie review is:" + [review text].
- During training, the model adjusts the [soft prompt embeddings] to better contextualize the task.
- Inference input: [soft prompt embeddings] + "This movie review is:" + "An exciting roller-coaster ride full of suspense"
- Expected output: "Positive"

Alternatively, a hard prompt is manually appended to the input:

- Input: "Sentiment classification: Positive or Negative. Review: An exciting roller-coaster ride full of suspense."
- Output: "Positive"
Further details on prompt tuning can be found in Shah [5].
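
A minimal sketch of soft prompt tuning with the peft library, initializing the soft prompt embeddings from the hard prompt text used in the example above. The checkpoint and token count are illustrative choices:

    from transformers import AutoModelForCausalLM
    from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

    prompt_config = PromptTuningConfig(
        task_type=TaskType.CAUSAL_LM,
        num_virtual_tokens=8,  # number of trainable soft prompt embeddings
        prompt_tuning_init=PromptTuningInit.TEXT,
        prompt_tuning_init_text="This movie review is:",
        tokenizer_name_or_path="openai-community/gpt2",
    )

    model = get_peft_model(model, prompt_config)
    model.print_trainable_parameters()  # only the soft prompt embeddings train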

P-Tuning
P-Tuning adds machine-generated continuous prompt embeddings alongside the task-specific input for prediction. For example, the input may be structured as: [P-tuning embeddings] + [original task input text].
However, it is worth noting that this approach works particularly well on large models but does not produce optimal results on small models.
For more information on P-Tuning, see Liu et al. [6].

Figure 5: P-Tuning
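
A minimal sketch of P-Tuning with the peft library, where a small prompt encoder generates the continuous prompt embeddings. The checkpoint and hyperparameters are illustrative choices:

    from transformers import AutoModelForCausalLM
    from peft import PromptEncoderConfig, TaskType, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

    ptuning_config = PromptEncoderConfig(
        task_type=TaskType.CAUSAL_LM,
        num_virtual_tokens=20,    # number of generated prompt embeddings
        encoder_hidden_size=128,  # hidden size of the prompt encoder
    )

    model = get_peft_model(model, ptuning_config)
    model.print_trainable_parameters()  # only the prompt encoder is trainable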

References
[1] E. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen, "LoRA: Low-Rank Adaptation of Large Language Models," arXiv, 2021. [Online]. Available: https://arxiv.org/abs/2106.09685v2. [Accessed: 07-Jan-2025].

[2] T. Dettmers, A. Pagnoni, A. Holtzman, and L. Zettlemoyer, "QLoRA: Efficient Finetuning of Quantized LLMs," arXiv, May 2023. [Online]. Available: https://arxiv.org/pdf/2305.14314.

[3] X. L. Li and P. Liang, "Prefix-Tuning: Optimizing Continuous Prompts for Generation," arXiv, 2021. [Online]. Available: https://arxiv.org/pdf/2101.00190. [Accessed: 07-Jan-2025].

[4] Z. Hu, L. Wang, Y. Lan, W. Xu, E.-P. Lim, L. Bing, X. Xu, S. Poria, and R. K.-W. Lee, "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models," in Proc. of the 2023 Conf. on Empirical Methods in Natural Language Processing (EMNLP), Dec. 2023. [Online]. Available: https://aclanthology.org/2023.emnlp-main.319.pdf.

[5] S. Shah, "Prompt-Tuning: A Powerful Technique for Adapting LLMs to New Tasks," Medium. [Online]. Available: https://medium.com/@shahshreyansh20/prompt-tuning-a-powerful-technique-for-adapting-llms-to-new-tasks-6d6fd9b83557.

[6] X. Liu, K. Ji, Y. Fu, W. L. Tam, Z. Du, Z. Yang, and J. Tang, "P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-Tuning Universally Across Scales and Tasks," arXiv preprint arXiv:2110.07602. [Online]. Available: https://arxiv.org/pdf/2110.07602.

[7] Hugging Face, "PEFT Package Reference," [Online]. Available: https://huggingface.co/docs/peft/en/package_reference/p_tuning.
