Stable Diffusion - Outpainting with InpaintOnly + LaMa in the ControlNet Extension: Algorithm and Application

Follow me on CSDN: https://spike.blog.csdn.net/
Article URL: https://spike.blog.csdn.net/article/details/131643131

LaMa: https://github.com/advimman/lama

Paper: Resolution-robust Large Mask Inpainting with Fourier Convolutions

LaMa: Large Mask inpainting

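Although modern image inpainting systems have made significant progress, they often struggle with large missing areas, complex geometric structures, and high-resolution images. One of the main reasons is the lack of an effective receptive field in both the inpainting network and the loss function. To address this, the paper proposes a new method called Large Mask inpainting (LaMa), which is based on:

(i) a new inpainting network architecture that uses fast Fourier convolutions (FFCs), which have an image-wide receptive field; (ii) a high receptive field perceptual loss; (iii) large training masks, which unlock the potential of the first two components.

The LaMa inpainting network improves the state of the art across a range of datasets and performs well even in challenging scenarios, such as completing periodic structures. The model generalizes surprisingly well to resolutions higher than those seen during training, and it does so at lower parameter and time costs than the competitive baselines.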
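To make the "image-wide receptive field" claim concrete, here is a toy 1-D sketch in pure Python. It is not LaMa's actual FFC layer: the function names, the per-frequency weights, and the 3-tap kernel are illustrative assumptions. It only shows that a pointwise operation in the frequency domain lets one input sample influence every output position, while an ordinary small convolution stays local.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def spectral_layer(x):
    """Scale each frequency bin by its own weight, then transform back.

    The weights are made up for this demo; a real FFC learns them.
    Only the real part is kept to keep the toy simple.
    """
    weights = [1.0 / (1 + k) for k in range(len(x))]
    spectrum = dft(x)
    return [v.real for v in idft([w * s for w, s in zip(weights, spectrum)])]

def local_conv3(x, kernel=(0.25, 0.5, 0.25)):
    """Ordinary 3-tap convolution with zero padding, for contrast."""
    padded = [0.0] + list(x) + [0.0]
    return [sum(kernel[j] * padded[i + j] for j in range(3)) for i in range(len(x))]

def changed_positions(layer, n=8, touched=3):
    """Indices whose output changes when a single input sample changes."""
    base = [0.0] * n
    bumped = list(base)
    bumped[touched] = 1.0
    out_a, out_b = layer(base), layer(bumped)
    return [i for i in range(n) if abs(out_a[i] - out_b[i]) > 1e-9]

print(changed_positions(local_conv3))     # [2, 3, 4]
print(changed_positions(spectral_layer))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

The local convolution reacts only at the touched sample and its two neighbours, whereas the spectral layer reacts everywhere, which is the property that helps with large masks.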

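Note: be sure to set the Control Mode to ControlNet is more important (更偏向ControlNet); otherwise LaMa has no effect.

Test results:

1. Base image

Command to start the SD service: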

conda deactivate
source venv/bin/activate
# python launch.py --port 9301 --xformers
nohup python -u launch.py --port 9301 --xformers > nohup.sd.out &
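Because the nohup command returns immediately, it can be handy to confirm that the service is actually listening before opening the browser. The following is a minimal sketch, assuming the service runs on the local machine (the host 127.0.0.1 is an assumption; port 9301 comes from the command above). The helper name is hypothetical.

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False

if __name__ == "__main__":
    # Port 9301 matches the --port value in the launch command.
    print(is_port_open("127.0.0.1", 9301))
```

If this prints False for more than a minute or two after launch, nohup.sd.out is the place to look for the startup error.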

ControlNet version: v1.1.231, upgraded to the latest version as follows:

cd stable-diffusion-webui/extensions/sd-webui-controlnet
git pull

Then restart the service.

The model is 墨幽人造人. Enter the customized prompt and settings:

1girl,moyou,best quality,detailed,8k hdr,RAW,intricate details,chiaroscuro,drop shadow,
(cosmetics:1.1),(rim light:1.2),
solo,(face details:1.3),(light green hair:1.1),eyes,hair accessories,
standing on the ground,full body,fashionable clothing,school uniform,
huge chest,lacteal sulcus,sneakers,on the bustling streets,(pinkshoes:1.2),short skirt
Negative prompt: EasyNegative,(badhandv4:1.2)
Steps: 30, Sampler: DDIM, CFG scale: 7, Seed: 777766374, Face restoration: CodeFormer, Size: 512x768, Model hash: 6a226dd292, Model: 墨幽人造人_v1010_完整版, Denoising strength: 0.2, Hires upscale: 2, Hires upscaler: 8x_NMKD-Superscale_150000_G, Version: v1.4.0

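Output image at the default 512x768 size:

2. Extending the image

Note: the Control Mode must be set to ControlNet is more important (更偏向ControlNet) for LaMa to take effect.

After saving the image, load it into the ControlNet extension, enable it, and configure it as follows:

Enable Pixel Perfect. Control Type: Inpaint. Preprocessor: inpaint_only+lama. Model: control_v11p_sd15_inpaint. Control Mode: ControlNet is more important, which generates more detail and lets LaMa take effect. Resize Mode: Resize and Fill.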
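For readers who drive the web UI through its HTTP API instead of the browser, the same settings can be written as a request payload. This is only a sketch under several assumptions: the web UI would have to be started with the --api flag (the launch command in this article does not include it), the outpainting step is assumed to run through txt2img, and the field names follow my understanding of the sd-webui-controlnet API, which may differ between versions. The model name may also need the hash suffix shown in the UI dropdown. The function names are hypothetical.

```python
import base64
import json

def controlnet_outpaint_unit(image_bytes: bytes) -> dict:
    """One ControlNet unit mirroring the UI settings described in the article."""
    return {
        "input_image": base64.b64encode(image_bytes).decode("ascii"),
        "module": "inpaint_only+lama",           # preprocessor
        "model": "control_v11p_sd15_inpaint",    # assumed name; may need a hash suffix
        "pixel_perfect": True,
        "control_mode": "ControlNet is more important",
        "resize_mode": "Resize and Fill",
    }

def txt2img_payload(prompt: str, width: int, height: int,
                    image_bytes: bytes, steps: int = 30) -> dict:
    """Payload for a txt2img request (assumed endpoint: /sdapi/v1/txt2img)."""
    return {
        "prompt": prompt,
        "sampler_name": "DDIM",
        "steps": steps,
        "width": width,
        "height": height,
        "alwayson_scripts": {
            "controlnet": {"args": [controlnet_outpaint_unit(image_bytes)]}
        },
    }

if __name__ == "__main__":
    # Placeholder bytes stand in for the saved 512x768 PNG.
    payload = txt2img_payload("1girl,moyou,best quality", 1024, 768, b"fake-png-bytes")
    print(json.dumps(payload, indent=2)[:200])
```

The payload is only built here, not sent, so the sketch runs without a server; check the field names against the extension's API documentation before relying on them.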

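Other parameter settings:

Sampling method: DDIM, the method recommended for the model. Sampling steps: 30~50. Width and height: if the target image is wide, i.e. the width:height ratio reaches 2:1 or more, split the extension into 2 passes to avoid generating multiple people. For example, extend the original 512x768 to 1024x768 and then to 1536x768, repeating the operation twice.

That is:

After 2 rounds of extension, 512x768 -> 1024x768 -> 1536x768, the output is a 1536x768 image: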
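The stepwise widening can be planned with a few lines of code. This is a small sketch, assuming each pass adds a fixed 512 pixels of width as in the example; the function name and the step parameter are illustrative, not part of the web UI.

```python
def expansion_steps(start_width: int, target_width: int, height: int,
                    step: int = 512) -> list:
    """List the (width, height) canvas sizes to generate, one per outpainting pass."""
    if step <= 0:
        raise ValueError("step must be positive")
    sizes = []
    width = start_width
    while width < target_width:
        # Never overshoot the target on the last pass.
        width = min(width + step, target_width)
        sizes.append((width, height))
    return sizes

print(expansion_steps(512, 1536, 768))  # [(1024, 768), (1536, 768)]
```

Each tuple is the width and height to enter for one pass, using the output of the previous pass as the ControlNet input image.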

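3. Enhancing details

ControlNet's Tile mode effectively removes the visible transitions at the seams.

Switch the image to img2img mode and configure the parameters:

Keep the prompt unchanged. Enable Restore faces, since the image is being redrawn. Resize by: 2, i.e. from 1536x768 to 3072x1536. Denoising strength: 0.6. Leave the rest at the defaults or as the model requires.

That is:

Configuration of the Tile feature in the ControlNet extension:

Enable Pixel Perfect. Control Type: Tile. Preprocessor: tile_resample. Model: control_v11f1e_sd15_tile. Leave the rest at the defaults.

That is:

Final result:

A comparison of the local details of the green handbag is shown below: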

4. Other tests

Model: Dreamshaper_7

4.1 Prompt

Modify the prompt by removing the character parts and adding a scene description, i.e.:

((masterpiece)),raw photo,(photo realistic:1.2),(extremely detailed),high detail,
sharp focus,
streets full of sense of crisis,modern buildings with a sense of science and technology,
detailed buildings,detailed streets,
cyberpunk,halo,cyberpunk art,holography,
[(colorful explosion psychedelic paint colors:1.25)::0.25],
<lora:more_details:1.2>,<lora:ClothingAdjuster2:-0.65>,

The negative prompt can stay unchanged, i.e.:

UnrealisticDream,BadDream,
ng_deepnegative_v1_75t,bad_prompt_version2-neg,EasyNegative,(badhandv4:1.2),
logo,watermark,signature,username,letterbox,symbol,text box,censored,
multiple girls,2 girls,2 females,2 women,
missing toes,too many toes,extra fingers,missing fingers,fused fingers,too many fingers,mutated hands,malformed hands,poorly drawn hands,bad hands,
extra limbs,malformed limbs,floating limbs,disconnected limbs,missing arms,missing legs,extra arms,extra legs,mutated legs,long neck,
bad anatomy,bad proportions,disfigured,long body,
mutated,deformed,dehydrated,ugly,
naked,nipples,cleavage,
door frame,window frame,mirror frame,out of frame,
cropped,blurry,out of focus,monochrome,worst quality,low quality,jpeg artifacts,

4.2 Resize settings

Select Resize and Fill, and set the width and height to 2048x1024 for a wide image, i.e.:

4.3 ADetailer

Face:

detailed face,perfect face,portrait,close up,

Hands:

detailed hands,detailed fingers,detailed nail, hands close up,

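Note: ADetailer has no effect in LaMa mode.

4.4 Extending with Inpaint Only + LaMa

Note: be sure to select the ControlNet is more important mode; otherwise LaMa has no effect, i.e.:

4.5 Refining with Tile mode

Control Mode: select ControlNet is more important, i.e.:

Prompt, combining the character and the scene: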

((masterpiece)),raw photo,(photo realistic:1.2),(extremely detailed),high detail,
sharp focus,
beautiful eyes,detailed eyes,detailed face,perfect facial details,perfect mouth,detailed lips,
solo,1girl,woman,asian girl,violet hair,long hair,
a girl with long hair and a blue dress and bling pantyhose standing in front of a neon background with stars and circles,
streets full of sense of crisis,modern buildings with a sense of science and technology,
detailed buildings,detailed streets,
more clothes,detailed clothes,colorful clothes,exquisite clothes,detailed shoes,perfect shoes,exquisite shoes,
focus on shoes detail,focus on fingers detail,
full body,full body photo,standing,
cyberpunk,halo,cyberpunk art,holography,
[(colorful explosion psychedelic paint colors:1.25)::0.25],
<lora:more_details:1.2>,<lora:ClothingAdjuster2:-0.65>,

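Comparison of the results, with more detail:

Miscellaneous

miaoshouai-assistant (喵手助理) extension

To add miaoshouai-assistant, go to Extensions - Install from URL and use this install link: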

https://ghproxy.com/https://github.com/miaoshouai/miaoshouai-assistant.git

This hits a bug, TypeError: 'type' object is not subscriptable:

File "stable-diffusion-webui/extensions/miaoshouai-assistant/scripts/runtime/msai_prelude.py", line 116, in MiaoshouPrelude
        def ENV_EXCLUSION(self) -> list[str]:
    TypeError: 'type' object is not subscriptable

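The cause is in miaoshouai-assistant/scripts/runtime/msai_prelude.py: the return annotation list[str] is only subscriptable on Python 3.9 and later, so older interpreters raise this error when the method is defined. Changing the annotation fixes it: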

# def ENV_EXCLUSION(self) -> list[str]:
def ENV_EXCLUSION(self) -> list:
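If you prefer to keep the element type in the annotation, typing.List works on older Python 3 versions as well. The sketch below uses a hypothetical stand-in class and a placeholder return value, since the extension's real method body is not shown here.

```python
import sys
from typing import List

class PreludeSketch:
    """Hypothetical stand-in for the extension's class, not its real code."""

    def ENV_EXCLUSION(self) -> List[str]:
        # typing.List[str] is subscriptable before Python 3.9, unlike list[str].
        return ["EXAMPLE_VAR"]  # placeholder value for illustration

if __name__ == "__main__":
    # On 3.9+ the original list[str] annotation would also have worked.
    print("builtin generics supported:", sys.version_info >= (3, 9))
    print(PreludeSketch().ENV_EXCLUSION())
```

Another option on Python 3.7 and later is to put from __future__ import annotations at the top of the module, which defers evaluation of annotations so that list[str] is never executed at definition time.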

The extension is not in use for now.