# LLaMA

## LLaMA

## Chinese LLaMA

### Environment setup

```bash
pip install sentencepiece
pip install scikit-learn
pip install --upgrade pytest
pip install deepspeed
```

### Model conversion and merging

```bash
# Merge the LoRA into the base model and save in the original .pth format
$ python merge_llama_with_chinese_lora.py \
    --base_model /root/llama-7b-hf \
    --lora_model /root/chinese_llama_plus_lora_7b \
    --output_dir /root/chinese-llama-7b \
    --output_type pth

# check sha256sum of file
$ sha256sum /root/chinese-llama-7b/consolidated.00.pth
# f8d380d63f77a08b7f447f5ec63f0bb1cde9ddeae2207e9f86e6b5f0f95a7955

# Merge the same LoRA and save in Hugging Face format
$ python merge_llama_with_chinese_lora.py \
    --base_model /root/llama-7b-hf \
    --lora_model /root/chinese_llama_plus_lora_7b \
    --output_dir /root/chinese-llama-7b \
    --output_type huggingface
Base model: /root/llama-7b-hf
LoRA model(s) ['/root/chinese_llama_plus_lora_7b']:
Loading checkpoint shards: 100%|█████████████████████████████████████| 2/2 [00:10<00:00,  5.33s/it]
Peft version: 0.3.0
Loading LoRA for 7B model
Loading LoRA /root/chinese_llama_plus_lora_7b...
base_model vocab size: 32000
tokenizer vocab size: 49953
Extended vocabulary size to 49953
Loading LoRA weights
Merging with merge_and_unload...
Saving to Hugging Face format...

# Merge two LoRAs in sequence (LLaMA Plus, then Alpaca Plus)
$ python merge_llama_with_chinese_lora.py \
    --base_model /root/llama-7b-hf \
    --lora_model /root/chinese_llama_plus_lora_7b,/root/chinese_alpaca_plus_lora_7b \
    --output_dir /root/chinese-llama-alpaca-7b \
    --output_type huggingface
Base model: /root/llama-7b-hf
LoRA model(s) ['/root/chinese_llama_plus_lora_7', '/root/chinese_alpaca_plus_lora_7']:
Loading checkpoint shards: 100%|█████████████████████████████████████| 2/2 [00:10<00:00,  5.08s/it]
Peft version: 0.3.0
Loading LoRA for 7B model
Loading LoRA /root/chinese_llama_plus_lora_7...
base_model vocab size: 32000
tokenizer vocab size: 49953
Extended vocabulary size to 49953
Loading LoRA weights
Merging with merge_and_unload...
Loading LoRA /root/chinese_alpaca_plus_lora_7...
base_model vocab size: 49953
tokenizer vocab size: 49954
Extended vocabulary size to 49954
Loading LoRA weights
Merging with merge_and_unload...
Saving to Hugging Face format...
```

### Model inference

```bash
python inference/inference_hf.py \
    --base_model /root/chinese-llama-7b \
    --lora_model /root/chinese-llama-7b_lora \
    --interactive

python inference/inference_hf.py \
    --base_model /root/chinese-llama-alpaca-7b \
    --with_prompt \
    --interactive
```

### Model training

```bash
lr=2e-4
lora_rank=8
lora_alpha=32
lora_trainable="q_proj,v_proj,k_proj,o_proj,gate_proj,down_proj,up_proj"
modules_to_save="embed_tokens,lm_head"
lora_dropout=0.05

pretrained_model=/model/chinese-llama-7b
chinese_tokenizer_path=/model/chinese-llama-7b
dataset_dir=/root/datasets
data_cache=/root/cache
per_device_train_batch_size=1
per_device_eval_batch_size=1
gradient_accumulation_steps=8
output_dir=/model/pstory78

deepspeed_config_file=ds_zero2_no_offload.json

export TF_CPP_MIN_LOG_LEVEL=3
export HF_HUB_OFFLINE=1

torchrun --nnodes 1 --nproc_per_node 1 run_clm_pt_with_peft.py \
    --deepspeed ${deepspeed_config_file} \
    --model_name_or_path ${pretrained_model} \
    --tokenizer_name_or_path ${chinese_tokenizer_path} \
    --dataset_dir ${dataset_dir} \
    --data_cache_dir ${data_cache} \
    --validation_split_percentage 0.001 \
    --per_device_train_batch_size ${per_device_train_batch_size} \
    --per_device_eval_batch_size ${per_device_eval_batch_size} \
    --do_train \
    --do_eval \
    --seed $RANDOM \
    --fp16 \
    --num_train_epochs 10 \
    --report_to tensorboard \
    --lr_scheduler_type cosine \
    --learning_rate ${lr} \
    --warmup_ratio 0.05 \
    --weight_decay 0.01 \
    --logging_strategy steps \
    --logging_steps 50 \
    --eval_steps 500 \
    --evaluation_strategy steps \
    --save_strategy steps \
    --save_total_limit 10 \
    --save_steps 500 \
    --gradient_accumulation_steps ${gradient_accumulation_steps} \
    --preprocessing_num_workers 8 \
    --block_size 512 \
    --output_dir ${output_dir} \
    --overwrite_output_dir \
    --ddp_timeout 30000 \
    --logging_first_step True \
    --lora_rank ${lora_rank} \
    --lora_alpha ${lora_alpha} \
    --trainable ${lora_trainable} \
    --modules_to_save ${modules_to_save} \
    --lora_dropout ${lora_dropout} \
    --torch_dtype float16 \
    --gradient_checkpointing \
    --ddp_find_unused_parameters False
```

### Troubleshooting

1. Warning: Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.

[Suppress HuggingFace logging warning: "Setting `pad_token_id` to `eos_token_id`:{eos_token_id} for open-end generation." - Stack Overflow](https://stackoverflow.com/questions/69609401/suppress-huggingface-logging-warning-setting-pad-token-id-to-eos-token-id)

```python
# Assumes model, tokenizer, inputs, device and generation_config (a dict)
# are already defined by the surrounding inference script.
# Passing pad_token_id explicitly silences the warning.
generation_output = model.generate(
    input_ids=inputs["input_ids"].to(device),
    attention_mask=inputs["attention_mask"].to(device),
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id,
    # pad_token_id=tokenizer.pad_token_id,
    **generation_config
)
```

2. RuntimeError: Failed to import transformers.testing_utils because of the following error (look up to see its traceback): No module named '_pytest'

pytest is either not installed or its version is too old.

```bash
pip install --upgrade pytest
```
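
A missing-module error like the `_pytest` one above can be caught before launching a long training run. The following is a minimal sketch, not part of the original scripts: the helper name `missing_modules` and the `REQUIRED` list are hypothetical, and the import names are assumptions (scikit-learn is imported as `sklearn`; `_pytest` is taken from the error message). It only checks that each module can be located, not that its version is recent enough.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat a broken or half-installed module as missing.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Assumed import names for the packages installed during environment setup.
REQUIRED = ["sentencepiece", "sklearn", "_pytest", "deepspeed"]

if __name__ == "__main__":
    absent = missing_modules(REQUIRED)
    if absent:
        print("Missing modules:", ", ".join(absent))
    else:
        print("All required modules are importable.")
```

If any module is reported missing, reinstalling the corresponding package with `pip install --upgrade` is the fix suggested above.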