Installing LLaMA-Factory - Artificial Intelligence

First version:

nvidia-smi

git clone https://github.com/hiyouga/LLaMA-Factory.git

cd LLaMA-Factory/

llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml

pip install --upgrade huggingface_hub

llamafactory-cli webui

llamafactory-cli webui --share

llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml

huggingface-cli login

git lfs install

sudo apt install git-lfs

llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml

nvidia-smi

pip list

pip install transformers==4.36.2

llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml

pip install transformers==4.41.2

llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml

pip uninstall torch torchvision torchaudio

conda create -n llm python==3.10.6

pip install -e ".[torch,metrics]"

llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml

Second version:

nvidia-smi

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

sh Miniconda3-latest-Linux-x86_64.sh

source .bashrc

python --version

sh Miniconda3-latest-Linux-x86_64.sh

source .bashrc

git clone https://github.com/hiyouga/LLaMA-Factory.git

conda create -n llm python==3.10.6

conda activate llm

cd LLaMA-Factory/

pip install -e ".[torch,metrics]"

pip install --upgrade huggingface_hub

huggingface-cli login

llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml

sudo apt install git-lfs

git clone https://huggingface.co/meta-llama/Meta-Llama-3-8B

Summary

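Both versions above describe how to train and deploy LLaMA models, or LoRA weights for them, with the `LLaMA-Factory` project on a Linux system with an NVIDIA GPU. Each of them contains some repeated and mistaken steps, so the cleaned-up summary that follows lays out the whole process more clearly.
### Summary of steps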
**Step 1: Check the GPU status**
- Run `nvidia-smi` to check the status of the NVIDIA GPU (driver version, memory usage, running processes).
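
The GPU check can be wrapped in a small function so that a missing driver gives a readable message instead of a "command not found" error. This is only a sketch: the function name `check_gpu` is made up for illustration, and the query fields are one reasonable choice among those `nvidia-smi` offers.

```shell
# Report each GPU's name, total memory and driver version,
# or explain why the check could not run.
check_gpu() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: install the NVIDIA driver first" >&2
        return 1
    fi
    # One CSV line per GPU, without the header row.
    nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader
}

check_gpu || echo "GPU check failed; GPU training will not work yet" >&2
```
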
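**Step 2: Install Miniconda**
- Download and install Miniconda (a minimal installer for the conda package and environment manager) to simplify dependency management.
- Download the installer with `wget`: `wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh`
- Run the installer with `sh Miniconda3-latest-Linux-x86_64.sh` and follow the prompts.
- Reload your shell configuration file (for example `source ~/.bashrc`) so that the newly installed `conda` and Python are on your `PATH`.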
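
The notes above run the installer interactively, twice. On a fresh server an unattended install may be more convenient. The sketch below assumes an install prefix of `$HOME/miniconda3` and uses hypothetical names (`MINICONDA_DIR`, `install_miniconda`); it skips the download if conda is already present.

```shell
# Assumed install prefix; change it if you keep conda elsewhere.
MINICONDA_DIR="${MINICONDA_DIR:-$HOME/miniconda3}"
INSTALLER="Miniconda3-latest-Linux-x86_64.sh"

install_miniconda() {
    if [ -x "$MINICONDA_DIR/bin/conda" ]; then
        echo "conda already installed at $MINICONDA_DIR"
        return 0
    fi
    wget -O "$INSTALLER" "https://repo.anaconda.com/miniconda/$INSTALLER" || return 1
    # -b: batch mode (no prompts, accepts the license); -p: install prefix.
    bash "$INSTALLER" -b -p "$MINICONDA_DIR" || return 1
    # Add the conda hook to ~/.bashrc so that new shells can find conda.
    "$MINICONDA_DIR/bin/conda" init bash
}

# Usage (not run automatically):
#   install_miniconda && source ~/.bashrc
```
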
**Step 3: Create a new conda environment**
- Create a conda environment dedicated to the LLaMA-Factory project, with Python 3.10.6:
  - `conda create -n llm python==3.10.6`
- Activate the environment: `conda activate llm`
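
If the setup is re-run, `conda create` will prompt about the existing environment. One way to make the step repeatable is to create the environment only when it is missing. The helper name `ensure_env` is hypothetical, and the check assumes the usual `conda env list` layout with environment names in the first column.

```shell
# Create the conda environment only if it does not exist yet.
ensure_env() {
    env_name="$1"
    py_version="$2"
    # The first column of `conda env list` holds the environment names.
    if conda env list | awk '{print $1}' | grep -qx "$env_name"; then
        echo "environment $env_name already exists"
    else
        # -y answers the confirmation prompt automatically.
        conda create -y -n "$env_name" "python=$py_version"
    fi
}

# Usage (not run automatically):
#   ensure_env llm 3.10.6 && conda activate llm
```
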
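**Step 4: Clone LLaMA-Factory and install its dependencies**
- Clone the LLaMA-Factory repository:
  - `git clone https://github.com/hiyouga/LLaMA-Factory.git`
- Change into the project directory:
  - `cd LLaMA-Factory/`
- Install LLaMA-Factory in editable mode together with its `torch` and `metrics` extras (PyTorch and the evaluation-metric libraries):
  - `pip install -e ".[torch,metrics]"`
- Upgrade the `huggingface_hub` library, which provides the `huggingface-cli` tool used to log in and to download or upload models:
  - `pip install --upgrade huggingface_hub`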
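
The first version of these notes switches between `transformers` releases and reinstalls PyTorch before training, which suggests version mismatches. A quick check after installation can surface such problems early. The function name `verify_install` is hypothetical; it should be run inside the activated `llm` environment.

```shell
verify_install() {
    # Print the PyTorch version and whether it can see a CUDA device.
    python -c 'import torch; print("torch", torch.__version__, "cuda:", torch.cuda.is_available())' || return 1
    # Print the transformers version picked up by the environment.
    python -c 'import transformers; print("transformers", transformers.__version__)' || return 1
    # Confirm that the CLI entry point was installed.
    llamafactory-cli version
}

# Usage (not run automatically):
#   verify_install
```
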
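**Step 5: Set up Hugging Face access**
- Log in to the Hugging Face Hub with an access token. Gated models such as Meta-Llama-3-8B can only be downloaded after logging in (and after access has been granted on the model page), and the same login lets you upload trained models or data later:
  - `huggingface-cli login`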
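
On a remote machine it can be easier to log in without the interactive prompt. The sketch below assumes the access token is stored in an environment variable named `HF_TOKEN`; the function name `hf_login` is made up for illustration. Note that a token passed on the command line may be visible to other users in the process list.

```shell
hf_login() {
    if [ -z "${HF_TOKEN:-}" ]; then
        echo "HF_TOKEN is not set; create a token at https://huggingface.co/settings/tokens" >&2
        return 1
    fi
    # Log in non-interactively, then print the account the token belongs to.
    huggingface-cli login --token "$HF_TOKEN" && huggingface-cli whoami
}

# Usage (not run automatically):
#   HF_TOKEN=<your token> hf_login
```
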
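**Step 6: Install git-lfs (large file support)**
- If you plan to clone large resources (such as the LLaMA base model), install git-lfs and enable it once for your user account:
  - `sudo apt install git-lfs`
  - `git lfs install`
- You can then clone the Meta-Llama-3-8B model (or whichever base model you need) to train on top of it:
  - `git clone https://huggingface.co/meta-llama/Meta-Llama-3-8B`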
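
Cloning a model repository without git-lfs leaves small pointer files in place of the weights, which is easy to miss. A guarded clone, sketched below with the hypothetical helper `clone_model`, refuses to start when git-lfs is absent and skips a destination that is already a git checkout. Because the Meta-Llama-3-8B repository is gated, `git clone` is likely to ask for your Hugging Face user name and an access token.

```shell
clone_model() {
    url="$1"
    dest="$2"
    if ! git lfs version >/dev/null 2>&1; then
        echo "git-lfs is missing; run: sudo apt install git-lfs" >&2
        return 1
    fi
    if [ -d "$dest/.git" ]; then
        echo "$dest already cloned"
        return 0
    fi
    # Enable the LFS filters for this user, then clone with the weights.
    git lfs install && git clone "$url" "$dest"
}

# Usage (not run automatically):
#   clone_model https://huggingface.co/meta-llama/Meta-Llama-3-8B Meta-Llama-3-8B
```
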
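**Step 7: Train the model with LLaMA-Factory**
- Start training with the LLaMA-Factory command-line interface (CLI). The example command uses a YAML configuration file from the repository that trains LoRA weights with supervised fine-tuning (SFT):
  - `llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml`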
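
To train from the locally cloned model instead of a Hub model ID, one option is to copy the example configuration and change its model path. The sketch below assumes the YAML file has a top-level `model_name_or_path` key; the helper name `make_local_config` and the output file name are hypothetical, and the path must not contain `|` or `&` because it is substituted through `sed`.

```shell
make_local_config() {
    src_yaml="$1"
    model_dir="$2"
    out_yaml="$3"
    # Rewrite only the model_name_or_path line; every other line is copied as is.
    sed "s|^model_name_or_path:.*|model_name_or_path: $model_dir|" "$src_yaml" > "$out_yaml"
}

# Usage (not run automatically):
#   make_local_config examples/train_lora/llama3_lora_sft.yaml ./Meta-Llama-3-8B my_lora_sft.yaml
#   llamafactory-cli train my_lora_sft.yaml
```

Working on a copy keeps the repository's example file unchanged, so a later `git pull` does not conflict with local edits.
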
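**Step 8 (optional): Start and share the Web UI**
- If needed, start the LLaMA-Factory Web UI to configure, monitor and manage training interactively:
  - `llamafactory-cli webui`
- To share the interface with others (useful for team collaboration), add the `--share` option:
  - `llamafactory-cli webui --share`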
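
On a shared server you may want to pin the Web UI to a particular GPU and port. The sketch below relies on two assumptions that are not stated in the notes above: that the Web UI is a Gradio app that honors the standard `GRADIO_SERVER_PORT` variable, and that `CUDA_VISIBLE_DEVICES` is used to restrict the visible GPUs. The function name `start_webui` is hypothetical.

```shell
start_webui() {
    gpu_ids="${1:-0}"    # comma-separated GPU indices, default: first GPU
    port="${2:-7860}"    # 7860 is Gradio's default port
    # The variables are set only for this one command.
    CUDA_VISIBLE_DEVICES="$gpu_ids" GRADIO_SERVER_PORT="$port" llamafactory-cli webui
}

# Usage (not run automatically):
#   start_webui 0 7860
```
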
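With the steps above, you can use LLaMA-Factory to train and explore LoRA weights for LLaMA models on a Linux machine with an NVIDIA GPU. The exact configuration and dependency versions may differ depending on your system and on the current state of the project.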