Alpaca + CodeLlama 34B Full Example.ipynb - Colab
To run this, press "Runtime" and press "Run all" on a free Tesla T4 Google Colab instance!
To install Unsloth on your own computer, follow the installation instructions on our GitHub page here.
You will learn how to do data prep, how to train, how to run the model, and how to save it (e.g. for Llama.cpp).
%%capture
import torch
major_version, minor_version = torch.cuda.get_device_capability()
# Must install separately since Colab has torch 2.2.1, which breaks packages
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
if major_version >= 8:
    # Use this for new GPUs like Ampere, Hopper GPUs (RTX 30xx, RTX 40xx, A100, H100, L40)
    !pip install --no-deps packaging ninja einops flash-attn xformers trl peft accelerate bitsandbytes
else:
    # Use this for older GPUs (V100, Tesla T4, RTX 20xx)
    !pip install --no-deps xformers trl peft accelerate bitsandbytes
pass
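The model-loading and data-prep cells fall on a page missing from this export. Below is a minimal sketch of what the later cells assume: the checkpoint name, dataset, and prompt template are assumptions consistent with the notebook title and the 51,760-row Map output further down, and max_seq_length, alpaca_prompt, tokenizer, and dataset are simply the names the later cells reference.

# A minimal sketch (assumed cell, not captured in this export): load CodeLlama 34B
# in 4-bit via Unsloth, then format the Alpaca dataset into a single "text" field.
from unsloth import FastLanguageModel
from datasets import load_dataset

max_seq_length = 2048  # assumed value; later cells only require this name to exist

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/codellama-34b-bnb-4bit",  # assumed pre-quantized checkpoint
    max_seq_length = max_seq_length,
    dtype = None,          # auto-detect (float16 on T4, bfloat16 on Ampere+)
    load_in_4bit = True,   # QLoRA-style 4-bit loading
)

# Alpaca-style prompt template; the three "{}" slots are instruction, input, output.
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token  # append EOS so generation knows where to stop

def formatting_prompts_func(examples):
    texts = [
        alpaca_prompt.format(instruction, inp, output) + EOS_TOKEN
        for instruction, inp, output in zip(
            examples["instruction"], examples["input"], examples["output"]
        )
    ]
    return {"text": texts}

dataset = load_dataset("yahma/alpaca-cleaned", split = "train")  # assumed dataset
dataset = dataset.map(formatting_prompts_func, batched = True)

The quantization warning that follows is the expected output of this loading step, since the checkpoint already ships with a 4-bit quantization config.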
You passed `quantization_config` to `from_pretrained` but the model you're loading already has a `quantization_config` attribute. The `quantization_config` from the model will be used.
We now add LoRA adapters so we only need to update 1 to 10% of all parameters!
model = FastLanguageModel.get_peft_model(
    model,
    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Currently only supports dropout = 0
    bias = "none",    # Currently only supports bias = "none"
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    max_seq_length = max_seq_length,
)
Unsloth 2023.12 patched 48 layers with 48 QKV layers, 48 O layers and 48 MLP layers.
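To check the "1 to 10%" figure for yourself, a PEFT model can report how many parameters are actually trainable. This quick check is not part of the original notebook, just a convenience; with r = 16 on a 34B model the trainable fraction is well below 1%.

# Report trainable (LoRA) vs. total parameters for the PEFT-wrapped model.
model.print_trainable_parameters()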
[NOTE] To train only on completions (ignoring the user's input) read TRL's docs here.
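A minimal sketch of what that looks like with TRL's DataCollatorForCompletionOnlyLM, which masks the loss on everything before the response. The response_template string is an assumption matching the Alpaca prompt used here; the collator would then be passed to the SFTTrainer below via data_collator.

from trl import DataCollatorForCompletionOnlyLM

# Compute the loss only on tokens after "### Response:" (completions only).
response_template = "### Response:"  # assumed to match the alpaca_prompt template
collator = DataCollatorForCompletionOnlyLM(response_template, tokenizer = tokenizer)
# Then add `data_collator = collator` to the SFTTrainer call below.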
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = TrainingArguments(
        per_device_train_batch_size = 4,
        gradient_accumulation_steps = 4, # effective batch size = 4 * 4 = 16
        warmup_steps = 10,
        max_steps = 120,
        learning_rate = 2e-4,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
    ),
)
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Map: 100% 51760/51760 [00:09<00:00, 5660.67 examples/s]
trainer_stats = trainer.train()
You're using a CodeLlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
Unsloth: `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
[120/120 16:48, Epoch 0/1]
Step Training Loss
1 1.589300
2 1.634000
3 1.608500
4 1.398300
5 1.652000
6 1.335000
7 1.408300
8 1.282900
9 1.352700
10 1.055300
11 1.100300
12 0.997700
13 1.052700
14 0.966900
15 0.882100
16 0.863700
17 0.846300
18 0.886200
19 0.724900
20 1.072100
21 0.856300
22 0.827600
23 0.875100
24 0.937700
25 0.886100
26 0.885200
27 0.992800
28 0.848600
Inference
Let's run the model! You can change the instruction and input - leave the output blank!
inputs = tokenizer(
    [
        alpaca_prompt.format(
            "Continue the Fibonacci sequence.", # instruction
            "1, 1, 2, 3, 5, 8", # input
            "", # output - leave this blank for generation!
        )
    ] * 1,
    return_tensors = "pt",
).to("cuda")
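The cell above only builds the tokenized prompt; the export cuts off before the generation call. A minimal sketch of the step that would follow (max_new_tokens = 64 is an assumed choice):

# Generate a completion from the tokenized prompt and decode it back to text.
outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
print(tokenizer.batch_decode(outputs)[0])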
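The intro also mentions saving the model (e.g. for Llama.cpp); those cells fall past the pages captured here. A minimal sketch, assuming Unsloth's save_pretrained_gguf helper and hypothetical output directory names:

# Save just the LoRA adapters (small files, standard PEFT format).
model.save_pretrained("lora_model")        # hypothetical directory name
tokenizer.save_pretrained("lora_model")

# Export a GGUF file for llama.cpp; the quantization method is an assumed choice.
model.save_pretrained_gguf("model", tokenizer, quantization_method = "q4_k_m")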