Uncond-Zero-for-ComfyUI

★ 51

A speed-friendly CFG alternative for sharper images without negative sampling

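Samples with Stable Diffusion without generating any negative (uncond) prediction: pre_fix uses information from the previous step and pre_scale controls its strength, providing CFG-like guidance that improves clarity and sharpness without a significant slowdown.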

💡 Improve generation quality and sharpness without using a negative prompt

🍴 8 Forks · 💻 Python · 🔄 2024-07-10

🔗 Original on GitHub

📄 README

Uncond-Zero-for-ComfyUI

Lets you sample with Stable Diffusion without generating any negative prediction!

I did this as a personal challenge: how good can a generation be without a negative prediction while following these rules:

No LCM/Turbo/Lightning or any similar method used to develop the tool ✔

Nothing that makes the sampling noticeably slower than using euler with a CFG scale of 1 ✔

Should work with “confusing prompts” which tend to make a mess, like “macro shot of a glowing forest spirit,leafy appendages outlined with veins of light,eyes a deep,enigmatic glow amidst the foliage.,” ✔

Should allow using a negative prompt despite not generating a negative prediction (shout out to Clybius, who helped me get started with the maths!) ✔

Should work with at most 12 steps ✔

The goal is to enhance the sampling and take even more advantage of other acceleration methods such as TensorRT engines.

With an RTX 4070:

SDXL 1024×1024 / TensorRT: 9.67 it/s

LCM SD 1.5 512×512 / TensorRT: 37.50 it/s

⚠ Examples will be at the bottom ⚠

Nodes

Uncond Zero

Connect it like a normal model patch, generally right after the model loader.

Scale: basically similar to the CFG scale. I implemented logic inspired by my other node, AutomaticCFG, with a few modifications to adapt it to not using any negative.

“pre_fix”: Uses the previous step to modify the current one. This is the main trick to get better quality / sharpness.

“pre_scale”: How strong the effect will be.

Recommended: 1 for sde/ancestral samplers, 1.5 if you want to use something like dpmpp2m.
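To make the idea of “using the previous step” concrete, here is a minimal sketch. It is not the node's actual formula, which this README does not spell out: the function name `apply_pre_fix` and the extrapolation rule are assumptions picked purely for illustration, and predictions are plain lists of floats rather than latent tensors.

```python
from typing import List, Optional


def apply_pre_fix(
    current: List[float],
    previous: Optional[List[float]],
    pre_scale: float = 1.0,
) -> List[float]:
    """Hypothetical illustration: push the current prediction further along
    the direction it moved since the previous step.

    ASSUMPTION: this extrapolation rule is a stand-in, not the node's code.
    """
    if previous is None:
        # First step: there is no history yet, so return the input unchanged.
        return list(current)
    # pre_scale controls how strongly the step-to-step change is amplified.
    return [c + pre_scale * (c - p) for c, p in zip(current, previous)]
```

With `pre_scale` at 0 the prediction is left untouched, which matches the idea that the parameter only controls the strength of the effect.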

IF THE CFG SCALE IS AT 1 OR IF THERE IS NO NEGATIVE (using the ConditioningSetTimestepRange node):

Does what is described above.

ELSE:

Acts like AutomaticCFG.
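The branch above lines up with how classifier-free guidance combines predictions: at a scale of 1 the negative term cancels out, so computing it adds nothing. The sketch below shows the standard CFG formula and the mode check. `pick_mode` and its return labels are hypothetical names, and standard CFG stands in for AutomaticCFG, whose rescaling logic is not reproduced here.

```python
from typing import List


def cfg_combine(cond: List[float], uncond: List[float], scale: float) -> List[float]:
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def pick_mode(cfg_scale: float, has_negative: bool) -> str:
    """Hypothetical helper mirroring the IF/ELSE described in this README."""
    if cfg_scale == 1 or not has_negative:
        # No negative prediction is needed: the Uncond Zero logic applies.
        return "uncond_zero"
    # A negative prediction is available and the scale is not 1.
    return "automatic_cfg"


# At scale 1 the result is exactly the positive prediction.
print(cfg_combine([2.0, 4.0], [1.0, 1.0], 1.0))
```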

Conditioning combine positive and negative

Affects the positive conditioning with the negative.

It treats the negative conditioning the same way, in case you want to use it during normal sampling, but its main purpose is only for the positive.

Caveat: The combination will go as far as the shortest conditioning. Meaning that if your negative is 3 × 77 tokens and your positive only 2 × 77, only 2/3 of your negative will be taken into account.
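The arithmetic of that caveat can be checked with a short sketch. The helper names are hypothetical and only token counts are modelled, not the conditioning tensors themselves.

```python
from fractions import Fraction

CHUNK = 77  # tokens per CLIP context chunk


def combined_length(positive_tokens: int, negative_tokens: int) -> int:
    """The combination only goes as far as the shortest conditioning."""
    return min(positive_tokens, negative_tokens)


def negative_fraction_used(positive_tokens: int, negative_tokens: int) -> Fraction:
    """Share of the negative conditioning that is actually taken into account."""
    return Fraction(combined_length(positive_tokens, negative_tokens), negative_tokens)


# Positive of 2 chunks, negative of 3 chunks.
print(combined_length(2 * CHUNK, 3 * CHUNK))
print(negative_fraction_used(2 * CHUNK, 3 * CHUNK))
```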

Conditioning crop or fill

This node lets you use longer or shorter prompts with Tensor RT engines.

When creating a tensor rt engine, you can set the context length.

Here, “context_opt” set at 4:

This is how long your context will be, meaning how many chunks of 77 tokens you can use.

The issue is that if you set it at 1, any longer prompt will make it spam your CLI and ignore the extra.

If you set it at more than 1 during the creation and use a shorter conditioning, it will generate noise while spamming the CLI.

So this node simply lets you set the desired context length. If your conditioning is longer, it will crop it. If it is shorter, it will concatenate an empty one until the length is reached.
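The crop-or-fill behaviour can be sketched as follows. This is a simplified illustration, not the node's code: conditionings are modelled as plain Python sequences with one item per token, `crop_or_fill` is a hypothetical name, and `empty_chunk` is assumed to stand for the 77-token conditioning of an empty prompt.

```python
from typing import List, Sequence

CHUNK = 77  # tokens per context chunk


def crop_or_fill(cond: Sequence, empty_chunk: Sequence, context_length: int) -> List:
    """Crop or pad `cond` so it is exactly context_length * 77 tokens long.

    ASSUMPTION: `empty_chunk` is the conditioning of an empty prompt (77 tokens).
    """
    if len(empty_chunk) != CHUNK:
        raise ValueError("empty_chunk must contain exactly 77 tokens")
    target = context_length * CHUNK
    # Longer conditioning: keep only the first `target` tokens.
    out = list(cond[:target])
    # Shorter conditioning: append empty chunks until the length is reached.
    while len(out) < target:
        out.extend(empty_chunk)
    # Trim any overshoot if `cond` was not a multiple of 77 tokens.
    return out[:target]


prompt = ["p"] * CHUNK
empty = ["e"] * CHUNK
print(len(crop_or_fill(prompt, empty, 4)))  # context_opt set at 4
```

With a context length of 4 the result is 4 × 77 = 308 tokens, whatever the length of the input.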

interrupt on NaN

While I have not seen any since the latest updates, tensor rt would sometimes throw a random black image. This node cancels the sampling if any invalid value is detected. It is also useful if you want to test Uncond Zero with bogus scales. The toggle replaces these values with 0 instead of cancelling.
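A minimal sketch of that check, under stated assumptions: values are a flat list of floats instead of a latent tensor, `check_values` and `SamplingInterrupted` are hypothetical names, and infinities are treated as invalid alongside NaN.

```python
import math
from typing import List


class SamplingInterrupted(RuntimeError):
    """Hypothetical exception standing in for cancelling the sampling."""


def check_values(values: List[float], replace_with_zero: bool = False) -> List[float]:
    """Cancel on invalid values, or replace them with 0 when the toggle is on.

    ASSUMPTION: "invalid" means NaN or +/- infinity.
    """
    if all(math.isfinite(v) for v in values):
        return list(values)
    if replace_with_zero:
        # Toggle enabled: keep sampling, with invalid entries set to 0.
        return [v if math.isfinite(v) else 0.0 for v in values]
    raise SamplingInterrupted("invalid value detected, sampling cancelled")


print(check_values([1.0, float("nan"), 2.0], replace_with_zero=True))
```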

Examples

(all images are workflows)

Nothing versus everything (SDXL/tensorrt), same generation speed:

SD 1.5 (merge) with LCM in 3 steps.

Vanilla / only the prediction scaled / “pre_fix” enabled / negative prompt added:

Negative prompt integration example:

Just “bad quality” (everything after will also have “bad quality” at the end):

Summer in the negative:

Winter:

Water:

Water, autumn:

pre_fix

off / 0.5 / 1

“skill issue”

You too! Discover how this man went from a bland face

To a smiling average dude:

To this very successful businessman with five fingers!

All use the same seed. The first image is “a man with a sad face” without any modification.

The second is with all the modifications enabled, but the prompt is only “a smiling man”.

The third one is “a smiling man wearing a suit, hiding behind a tree, hdr quality”.

Or in short: a better prompt will actually give you a better result. It may seem obvious, but a negative prediction generally makes the result good even when the prompt is simple, while without it a simple prompt does not. If anything, that is for me the biggest (if big) caveat: I am not allowed to be as lazy as I like, and sometimes have to add at least two or three words to my prompts to make them better 😪.

Tips:

You can use my temperature node to lower or raise the CLIP temperature; it will greatly change the output!

I wouldn’t be against SOME support! 🙂

Pro tip:

Did you know that my main activity is writing creative model merging functions?

While the code is too much of a mess to be shared, I do expose and share my models. You can find them in this gallery! 😁