
Intro to LLM Fine Tuning

Amber Liu
2023/09
Difference Between Pre-training and Supervised Fine-tuning

Stage     | Pretraining                                | Supervised Fine-tuning
Algorithm | language modeling: predict the next token  | same: predict the next token
Dataset   | raw internet text, ~trillions of words     | carefully curated text, ~10-100K (prompt, response) pairs
          | low quality, large quantity                | high quality, low quantity
Resource  | 1000s of GPUs, months of training          | 1-100 GPUs, days of training
          | ex: GPT, LLaMA, PaLM                       | ex: Vicuna-13B

https://build.microsoft.com/en-US/sessions/db3f4859-cd30-4445-a0cd-553c3304f8e2
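Both stages share the same training objective. A minimal PyTorch sketch of that next-token objective (`model` here stands for any causal LM that maps token ids to logits over the vocabulary; the function name is illustrative):

```python
import torch.nn.functional as F

# Minimal sketch of the shared objective: given tokens t_0..t_{n-1},
# predict t_1..t_n and score the predictions with cross-entropy.
def next_token_loss(model, token_ids):        # token_ids: (batch, seq_len)
    logits = model(token_ids[:, :-1])         # (batch, seq_len - 1, vocab)
    targets = token_ids[:, 1:]                # the "next token" at each position
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # flatten batch and positions
        targets.reshape(-1),
    )
```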
Pretrained Models are NOT Assistants
• Base model does not answer questions
• It only wants to complete internet documents
• Language models are not aligned with user intent

For example, given the prompt "Write a poem about bread and cheese.", a base model may simply continue the document with more prompt-like lines instead of writing the poem:

Write a poem about someone who died of starvation.
Write a poem about angel food cake.
Write a poem about someone who choked on a ham sandwich.
Write a poem about a hostess who makes the ...
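You can reproduce this behavior with any base checkpoint; the sketch below uses GPT-2 as a small stand-in. The sampled continuation will vary, but it will rarely be an answer to the instruction:

```python
from transformers import pipeline

# GPT-2 as a small stand-in for a base (non-instruction-tuned) model.
generator = pipeline("text-generation", model="gpt2")

out = generator("Write a poem about bread and cheese.\n",
                max_new_tokens=40, do_sample=True)
print(out[0]["generated_text"])
# Typically continues the *document* (often with more prompt-like text)
# rather than answering, since the model was only trained to predict
# the next token of internet text.
```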
When do you want Fine-Tuning?
1. Vanilla fine-tuning
• Gain knowledge for a specific downstream task
2. Prompt engineering
• Precise control over the output
• No computing resources needed
3. Instruction tuning
• Make the LLM adhere to human instructions (a data-format sketch follows below)

[Diagram: the methods plotted along a spectrum from "Gain Knowledge" to "Behavior Change"]
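Instruction tuning trains on (prompt, response) pairs rendered into plain training strings. The exact template varies by project; the Alpaca-style markers and the pair below are both made up for illustration:

```python
# Hypothetical (prompt, response) pair and one common rendering template.
example = {
    "prompt": "Write a poem about bread and cheese.",
    "response": "Golden crust, a wheel of brie, ...",
}

TEMPLATE = "### Instruction:\n{prompt}\n\n### Response:\n{response}"
print(TEMPLATE.format(**example))
```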
When do you want Fine-Tuning?
4. Retrieval Augmented Generation (RAG)
• Feed retrieved documents to the LLM at inference time instead of fine-tuning the knowledge in
5. Parameter-Efficient Fine-Tuning (PEFT)
• Update only a small subset of parameters (defined below)
6. Reinforcement Learning from Human Feedback (RLHF)
• Align with human preference
RLHF
• A supervised fine-tuned model is still not aligned with human preference; RLHF optimizes it further against human preference judgments.

Challenges of full fine-tuning:
1. Memory capacity intensive
2. Computation intensive

Parameter-Efficient Fine-tuning (PEFT): a class of methods that adapt LLMs by updating only a small subset of model parameters.

https://arxiv.org/pdf/2307.10169.pdf
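To make "memory intensive" concrete, a common back-of-envelope for full fine-tuning with Adam in mixed precision is ~16 bytes per parameter (2 for fp16 weights, 2 for gradients, 4 for fp32 master weights, 4+4 for Adam's two fp32 moments), so a 7B-parameter model needs on the order of 112 GB before activations. PEFT sidesteps this because only trainable parameters carry gradients and optimizer state; a small helper (names illustrative) to check that fraction:

```python
import torch.nn as nn

def report_trainable(model: nn.Module) -> None:
    # Only parameters with requires_grad=True receive gradients and
    # optimizer state, so this fraction drives the memory footprint.
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable: {trainable:,} / {total:,} "
          f"({100 * trainable / total:.2f}%)")
```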
PEFT Taxonomy

(Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning)

Additive: Adapters
• Add additional, learnable layers into the Transformer architecture and train only those (~3% of parameters)
(Parameter-Efficient Transfer Learning for NLP)
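A minimal sketch of the bottleneck adapter from the cited paper (Houlsby et al.): down-project, nonlinearity, up-project, plus a residual connection; the bottleneck size and activation here are illustrative choices:

```python
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: small layers inserted after Transformer
    sublayers; only these weights get trained."""
    def __init__(self, d_model: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, d_model)

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))  # residual connection
```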


Selective: BitFit
• Only fine-tune the biases of the network (<1% of parameters)
• Falls behind full fine-tuning when the model size is large
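BitFit is one line of PyTorch in practice; a minimal sketch:

```python
import torch.nn as nn

def apply_bitfit(model: nn.Module) -> None:
    # BitFit: make only the bias terms trainable; freeze everything else.
    for name, param in model.named_parameters():
        param.requires_grad = "bias" in name
```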


Reparametrization-based: LoRA

- Freeze the pretrained weights; only update a pair of low-rank matrices
- Up to 10,000x fewer trainable parameters (on GPT-3 175B, per the LoRA paper)
- ~3x lower GPU memory requirement
- Applies to any linear layer
- No inference overhead (the low-rank update can be merged into the weights)
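A minimal sketch of the reparametrization on a single linear layer, following the LoRA paper's initialization (A random, B zero):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch of LoRA: the pretrained weight W stays frozen; only the
    low-rank pair (A, B) is trained, so the effective weight is
    W + (alpha / r) * B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                  # freeze W (and its bias)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Because B is zero-initialized, training starts from the pretrained behavior, and after training the product B @ A can be merged into W, which is why there is no inference overhead.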
QLoRA
- Quantize the frozen base model to 4-bit (NF4) and train LoRA adapters on top of it
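One way to set this up with the Hugging Face stack (transformers + bitsandbytes + peft); the checkpoint name and LoRA hyperparameters below are illustrative, not prescribed by the paper:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the frozen base model with 4-bit NF4 weights (QLoRA's quantization).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # placeholder checkpoint; any causal LM works
    quantization_config=bnb_config,
)
model = prepare_model_for_kbit_training(model)

# Train only LoRA adapters on top of the quantized, frozen weights.
lora = LoraConfig(r=16, lora_alpha=32,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```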
Fine-tuning Libraries
1. PyTorch
2. Hugging Face PEFT
3. Lamini
4. OpenAI Fine-tuning API
References
1. LoRA: Low-Rank Adaptation of Large Language Models
2. Prefix Tuning: P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
3. Prompt Tuning: The Power of Scale for Parameter-Efficient Prompt Tuning
4. P-Tuning: GPT Understands, Too
5. Parameter-Efficient Transfer Learning for NLP
6. Challenges and Applications of Large Language Models
7. QLoRA: Efficient Finetuning of Quantized LLMs
8. Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning
Source
https://build.microsoft.com/en-US/sessions/db3f4859-cd30-4445-a0cd-553c3304f8e2
https://web.stanford.edu/class/cs224n/slides/cs224n-2023-lecture11-prompting-rlhf.pdf
https://www.bilibili.com/video/BV1Tu4y1R7H5/?spm_id_from=333.788.recommend_more_video.0&vd_source=39940709d86c95c61be9bec979dfb187
https://www.youtube.com/watch?v=dA-NhCtrrVE
