Deploying Stable Diffusion on AMD 7000-Series GPUs (Ubuntu 22.04)

OS: Ubuntu 22.04.4 LTS 64-bit

GPU: AMD Radeon RX 7600

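Preface

For installing ROCm itself, see "Installing ROCm 6.0 on Ubuntu 22.04: Tutorial and Pitfalls": http://t.csdnimg.cn/d9vLb

Because ROCm 6.0 was already installed on this machine, the SD deployment below uses the 6.0 dependencies throughout. Some problems remain unsolved, such as enabling xFormers, mainly because the xFormers build expects a different PyTorch: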

PyTorch 2.2.2+cu121 with CUDA 1201 (you have 2.4.0.dev20240424+rocm6.0)

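According to MarKA's article, ROCm 5.7 appears to be supported more stably, so it may have fewer bugs. For ROCm 5.7, refer to that article.

Most of the command steps here are taken from MarKA's article, with changes and additions based on the problems I ran into.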

I. Preparation

1. Check the ROCm information

If the GPU is listed in the output, the driver is installed correctly:

rocminfo
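
The rocminfo listing is long, and the part that matters later is the GPU architecture name (gfx1102 for the RX 7600, as the collect_env output further down shows). A small filter can pull it out. This is a minimal sketch: list_gfx_targets is a made-up helper name, and it assumes the targets are printed as gfx followed by hex digits.

```shell
# Hypothetical helper: print the unique gfx targets found on stdin.
# Assumes targets look like "gfx" + hex digits (for example gfx1102).
list_gfx_targets() {
    grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Usage (requires a working ROCm install):
#   rocminfo | list_gfx_targets
```

If the function prints nothing, rocminfo did not report a GPU agent, which usually points back at the driver installation.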

2. Install conda

Any conda distribution works; this article uses Miniconda, as in MarKA's article.

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash ./Miniconda3-latest-Linux-x86_64.sh
conda config --set auto_activate_base false # do not activate the base environment automatically

3. Create the conda environment

Create an environment named sd with Python 3.10.6 (you can choose another version):

conda create -n sd python=3.10.6 -y
conda activate sd # activate the sd environment

[Note] To delete the conda environment:

conda remove --name sd --all

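II. Installing PyTorch

1. Install PyTorch (inside the sd environment)

Note: install torch from the official index. Wheels from domestic mirrors lead to import torch errors.

As of 2024-04-25, ROCm 6.0 no longer seems to require the Preview (Nightly) build, so the stable index URL can be used: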

pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0

The Preview (Nightly) index URL looks like this. Use it if the URL above gives you trouble:
https://download.pytorch.org/whl/nightly/rocm6.0

2. Verify the installation

Check that torch imports and that the GPU is detected. Success and True mean everything is working:

python3 -c 'import torch' 2> /dev/null && echo 'Success' || echo 'Failure'
python3 -c 'import torch; print(torch.cuda.is_available())'

If False is printed, check whether a third-party conda or pip mirror was used during installation.

Check the GPU device name:

python3 -c "import torch; print(f'device name [0]:', torch.cuda.get_device_name(0))"

Show the component information of the PyTorch environment:

python3 -m torch.utils.collect_env

The key line in the output is Is CUDA available: True

Python version: 3.10.14 (main, Mar 21 2024, 16:24:04) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-6.5.0-28-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: AMD Radeon RX 7600 (gfx1102)
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 6.0.32830
MIOpen runtime version: 3.0.0
Is XNNPACK available: True

III. Deploying Stable Diffusion WebUI

1. Clone Stable Diffusion WebUI locally

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git

If GitHub is unreachable, map github.com to 140.82.113.3 in the hosts file and then clone:

echo "140.82.113.3 github.com" | sudo tee -a /etc/hosts
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
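
Because tee -a appends every time it runs, repeating the command above leaves duplicate lines in /etc/hosts. One way around that is to append only when the exact line is missing. A minimal sketch: ensure_line is a hypothetical helper name, and writing to /etc/hosts itself still requires root.

```shell
# Hypothetical helper: append LINE to FILE only if FILE does not
# already contain that exact line.
ensure_line() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage, from a root shell:
#   ensure_line /etc/hosts '140.82.113.3 github.com'
```

The -x and -F options make grep compare whole lines literally, so the dots in the IP address are not treated as wildcards.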

2. Enter the SD directory and install the dependencies

cd stable-diffusion-webui
pip install -r requirements.txt -i https://pypi.org/simple

Note: in my tests a domestic pip mirror (the Tsinghua mirror) caused problems, so use the official pip index.

3. Test launch (in the SD directory)

Test with a minimal set of arguments:

python launch.py --listen --autolaunch

If you cannot reach https://huggingface.co, use the following command instead (hf-mirror.com appears to be a mirror of Hugging Face):

HF_ENDPOINT=https://hf-mirror.com python launch.py --listen --autolaunch

4. Launch command (in the SD directory)

conda activate sd && cd ~/stable-diffusion-webui
HSA_OVERRIDE_GFX_VERSION=11.0.0 python3 launch.py

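# Arguments can be appended to the launch command. Common ones
# (the full list is in the official wiki: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Command-Line-Arguments-and-Settings)
--precision full # run in full-precision floating point
--no-half # do not switch the model to half precision; on my 8 GB card I run out of VRAM without it
--medvram # add this if a 6 GB/8 GB card runs out of VRAM
--lowvram # add this if a 4 GB card runs out of VRAM
--always-batch-cond-uncond # always batch the cond/uncond passes together, even with --medvram or --lowvram
--xformers # memory-saving attention library; must be installed separately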
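
The VRAM guidance in the list above can be written as a tiny helper that picks a flag from the card's memory size. This is only a sketch: vram_flag is a hypothetical name, and the thresholds simply restate the 4 GB and 6 GB/8 GB rules of thumb from the list. The flags are only needed if generation actually runs out of VRAM.

```shell
# Hypothetical helper: map VRAM size in GB to the memory flag
# suggested in the list above.
vram_flag() {
    gb=$1
    if [ "$gb" -le 4 ]; then
        echo "--lowvram"
    elif [ "$gb" -le 8 ]; then
        echo "--medvram"
    else
        echo ""    # larger cards: no memory flag by default
    fi
}

# Usage:
#   HSA_OVERRIDE_GFX_VERSION=11.0.0 python3 launch.py $(vram_flag 8)
```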

Notes:

HIP_VISIBLE_DEVICES=0 is optional. On multi-GPU systems (for example an integrated GPU plus a discrete GPU), if you get RuntimeError: Torch is not able to use GPU, change 0 to 1 and put the variable in front of HSA_OVERRIDE_GFX_VERSION.

Set HSA_OVERRIDE_GFX_VERSION according to your GPU model:
HSA_OVERRIDE_GFX_VERSION=9.0.6: Radeon VII
HSA_OVERRIDE_GFX_VERSION=10.3.0: RX 5000 / 6000 series
HSA_OVERRIDE_GFX_VERSION=11.0.0: RX 7000 series
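
The model-to-value table in the notes can also be keyed on the gfx target that rocminfo reports, which avoids guessing the series by name. The following is a sketch under assumptions: gfx_override is a hypothetical helper, and the gfx patterns (gfx906 for Radeon VII, gfx101x/gfx103x for RX 5000/6000, gfx110x for RX 7000) are my mapping of the series above onto architecture names, not something the notes state.

```shell
# Hypothetical helper: map a gfx target to the HSA_OVERRIDE_GFX_VERSION
# value from the notes. The gfx-to-series mapping is an assumption.
gfx_override() {
    case "$1" in
        gfx906)          echo "9.0.6"  ;;  # Radeon VII
        gfx101*|gfx103*) echo "10.3.0" ;;  # RX 5000 / 6000 series
        gfx110*)         echo "11.0.0" ;;  # RX 7000 series
        *)               return 1      ;;  # unknown: set the value by hand
    esac
}

# Usage:
#   HSA_OVERRIDE_GFX_VERSION=$(gfx_override gfx1102) python3 launch.py
```

Returning a non-zero status for unknown targets keeps the launch from silently running with an empty override.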

IV. Miscellaneous

GPU monitoring

# GPU monitoring
watch -n 1 rocm-smi
(the 1 means refresh every second)
# For more detailed information, install the radeontop monitoring tool
sudo apt install mesa-utils radeontop
sudo radeontop

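About ROCm 6.1

AMD's official documentation has been updated for it (see the references below), but the official instructions rely heavily on docker, which is a headache, so I did not try it any further.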

V. Errors

# If you see this error:
ValueError: Unknown scheme for proxy URL URL('socks://127.0.0.1:7890/')
# the proxy setting or port is wrong, usually left over from changing the system proxy. Clear the proxy variables:
unset all_proxy; unset ALL_PROXY
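
The error above names a proxy URL with the scheme socks://, which the HTTP client does not recognize. If you would rather keep your other proxy settings, a sketch like the following clears only the variables that carry that scheme. clear_socks_proxies is a hypothetical name, and checking the http_proxy/https_proxy variants too is an assumption beyond the two variables named above.

```shell
# Hypothetical helper: unset only those proxy variables whose value
# starts with the unsupported "socks://" scheme.
clear_socks_proxies() {
    for name in all_proxy ALL_PROXY http_proxy HTTP_PROXY https_proxy HTTPS_PROXY; do
        eval "value=\${$name:-}"    # read the variable named by $name
        case "$value" in
            socks://*) unset "$name" ;;
        esac
    done
}

# Usage: define the function in the current shell, then run
#   clear_socks_proxies
```

The function has to run in the same shell that launches the WebUI, since a child process cannot unset variables in its parent.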

# Message I saw when running out of VRAM:
Kernel has requested more VGPRs than are available on this agent code: 0x2d  Aborted (core dumped)

# xFormers: not solved under ROCm 6.0 yet. It reports:
PyTorch 2.2.2+cu121 with CUDA 1201 (you have 2.4.0.dev20240424+rocm6.0)

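References:

1. No-brainer notes on deploying full-performance Stable Diffusion on AMD GPUs (ROCm5.7.1/6.1) (SD+Fooocus+ComfyUI) (updated irregularly), MarKA's article on Zhihu
2. Deploying Stable Diffusion WebUI on Ubuntu with an AMD GPU, based on PyTorch 2.0.0 and ROCm 5.4.2
3. Latest 2023 AI image generation tutorial for AMD cards: installing ROCm on an all-AMD PC to run Stable Diffusion
4. Installing PyTorch for ROCm - ROCm installation (Linux)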