Large Language Models
Instructor Name: 刘晔
Grade:
Instructor's Signature:
Engineering Design
1 Background Explanation
1.1 Background
China:
(1) Research and Development:
China has emerged as a significant player in the field of artificial intelligence.
International Context:
(1) Global Competition:
The development of large language models is a highly competitive field, with contributions coming from various countries around the world. Besides China, the United States, Europe, Canada, and other Asian countries are all active in this area.
1.3 Objectives
2 Project Process
cd /root/
wget -U "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-2023.07-1-Linux-x86_64.sh --no-check-certificate  # download the installer
bash Anaconda3-2023.07-1-Linux-x86_64.sh  # run the installer
conda -V  # open a new terminal and check that the version is 23.5.2
conda info  # verify that the installation succeeded
cd /root/
unzip ChatGLM2-6B.zip  # unzip the project archive
cd /root/ChatGLM2-6B/chatglm2-6b  # enter the model directory
# Tsinghua mirror: https://cloud.tsinghua.edu.cn/d/674208019e314311ab5c/?p=%2Fchatglm2-6b&mode=list
# Run the following command for each of the 7 pytorch_model-*.bin files
wget --no-check-certificate <download link>
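Once the downloads finish, it is worth confirming that all seven weight shards are present and look reasonable before moving on. Below is a minimal Python sketch of such a check; the directory path matches the one used above, and the shard file names are whatever the wget commands saved.

import glob
import os

model_dir = "/root/ChatGLM2-6B/chatglm2-6b"  # model directory used above
shards = sorted(glob.glob(os.path.join(model_dir, "pytorch_model-*.bin")))

# The Tsinghua mirror provides 7 pytorch_model-*.bin shard files
print(f"found {len(shards)} shard file(s)")
for path in shards:
    print(path, f"{os.path.getsize(path) / 1024 ** 2:.1f} MB")
assert len(shards) == 7, "some pytorch_model-*.bin shards are missing"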
cd /root/work/ChatGLM2-6B
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple  # install dependencies from the Tsinghua mirror
(6) Launch web_demo.py
Before launching web_demo.py, make the following code modifications. The string "THUDM/chatglm2-6b" is the path from which the model implementation is loaded; change it to your own model file path (here /root/ChatGLM2-6B/chatglm2-6b).
# Change to your own model file path
tokenizer = AutoTokenizer.from_pretrained("/root/ChatGLM2-6B/chatglm2-6b", trust_remote_code=True)
# Quantize the model to 4-bit before moving it to the GPU
model = AutoModel.from_pretrained("/root/ChatGLM2-6B/chatglm2-6b", trust_remote_code=True).quantize(4).cuda()
cd /home/aistudio/work/ChatGLM2-6B
python web_demo.py  # launch the web demo
# Install the dependencies required for fine-tuning
!pip install rouge_chinese nltk jieba datasets transformers[torch] -i https://pypi.douban.com/simple/
cd /root/ChatGLM2-6B/ptuning/
!ls -alh ptuning  # check the dataset files (a quick inspection sketch follows)
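A quick way to confirm the data format is to read one record in Python. The sketch below assumes the training file is in JSON Lines format (one JSON object per line); the path and the field names summary and title come from the fine-tuning script shown later in this step.

import json

# Path taken from the --train_file argument of the fine-tuning script below
train_file = "/root/ChatGLM2-6B/ptuning/dataSet/chinese_simplified_train.json"

with open(train_file, encoding="utf-8") as f:
    first = json.loads(f.readline())  # one JSON object per line (JSON Lines)

# prompt_column and response_column used for fine-tuning
print(first["summary"])  # the article summary used as the prompt
print(first["title"])    # the title used as the target response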
PRE_SEQ_LEN and LR in the training script are the soft prompt length and the learning rate, respectively, and can be adjusted to achieve the best results.
The P-Tuning v2 method freezes all of the original model's parameters; the quantization level of the frozen model can be set with quantization_bit, and without this option the model is loaded in FP16 precision.
Under the default configuration the frozen model parameters are quantized to INT4, and one training iteration performs 16 accumulated forward and backward passes with a batch size of 1, which is equivalent to a total batch size of 16 and requires a minimum of about 6.7 GB of GPU memory. To improve training efficiency at the same total batch size, you can increase per_device_train_batch_size while keeping the product of the two unchanged, but this also consumes more GPU memory.
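In other words, the effective batch size is the product of per_device_train_batch_size and gradient_accumulation_steps (times the number of GPUs). The small sketch below, using the numbers from the default configuration described above, shows two settings with the same effective batch size of 16:

def effective_batch_size(per_device_train_batch_size, gradient_accumulation_steps, num_gpus=1):
    # Number of samples that contribute to one optimizer update
    return per_device_train_batch_size * gradient_accumulation_steps * num_gpus

print(effective_batch_size(1, 16))  # default configuration -> 16
print(effective_batch_size(2, 8))   # faster per step, more GPU memory -> 16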
PRE_SEQ_LEN=128
LR=2e-2
NUM_GPUS=1

torchrun --standalone --nnodes=1 --nproc-per-node=$NUM_GPUS main.py \
    --do_train \
    --train_file dataSet/chinese_simplified_train.json \
    --validation_file dataSet/chinese_simplified_val.json \
    --preprocessing_num_workers 10 \
    --prompt_column summary \
    --response_column title \
    --overwrite_cache \
    --model_name_or_path /root/ChatGLM2-6B/chatglm2-6b \
    --output_dir output/adgen-chatglm2-6b-pt-$PRE_SEQ_LEN-$LR \
    --overwrite_output_dir \
    --max_source_length 256 \
    --max_target_length 128 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --predict_with_generate \
    --max_steps 1500 \
    --logging_steps 10 \
    --save_steps 500 \
    --learning_rate $LR \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4
Parameter description:
PRE_SEQ_LEN: the length of the trainable prefix (soft prompt) that P-Tuning v2 prepends to the input; it can be adjusted up or down to suit your dataset, and 128 is chosen here.
possible.
(6) Model training
(4) Run inference
(3) Model Evaluation
After model inference is completed, we need to evaluate the model to understand its performance and accuracy. Model evaluation can be done by comparing the outputs with labeled data and calculating metrics such as accuracy, recall, and F1 score. The evaluation results help us understand the strengths and weaknesses of the model and guide further improvement and adjustment.
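For the text-generation task in this project, the metrics reported in the summary are BLEU-4 and ROUGE rather than classification metrics. Below is a minimal sketch of scoring one prediction against its reference using the rouge_chinese, jieba, and nltk packages installed earlier; the helper function and the example strings are illustrative only.

import jieba
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu
from rouge_chinese import Rouge

def score_pair(prediction, reference):
    # rouge_chinese expects space-separated tokens, so segment with jieba first
    hyp_tokens = list(jieba.cut(prediction))
    ref_tokens = list(jieba.cut(reference))
    rouge_scores = Rouge().get_scores(" ".join(hyp_tokens), " ".join(ref_tokens))[0]
    bleu4 = sentence_bleu([ref_tokens], hyp_tokens,
                          smoothing_function=SmoothingFunction().method3)
    return {
        "rouge-1": rouge_scores["rouge-1"]["f"],
        "rouge-2": rouge_scores["rouge-2"]["f"],
        "rouge-l": rouge_scores["rouge-l"]["f"],
        "bleu-4": bleu4,
    }

print(score_pair("模型生成的标题", "数据集中的参考标题"))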
(4) Run evaluation
3 Summary
I find that the evaluation result is higher than the given fine-
tuning evaluation baseline. After fine-tuning, the model outperforms
the original model on several evaluation metrics (Bleu-4, Rouge-1,
Rouge-2, Rouge-L). This means that the fine-tuning done helps to
improve the performance of the model, making it more accurate or
more in line with expectations when generating text or
accomplishing other tasks.