comfyui-flux-accelerator

★ 140

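Flux.1 acceleration, TAEF1, quantization and compilation optimization

A custom node for ComfyUI that significantly accelerates Flux.1 image generation at acceptable image quality by using TAEF1, quantization with torch.compile, and skipping redundant DiT blocks.

💡 Generate Flux.1 images quickly in ComfyUI to save time and compute.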

🍴 5 Forks · 💻 Python · 🔄 2024-12-19

📦 requirements.txt

torchao
triton
xformers

📄 README

🍭 ComfyUI Flux Accelerator

Note

A Japanese version of this README is also available.

ComfyUI Flux Accelerator is a custom node for ComfyUI that accelerates Flux.1 image generation simply by adding the node to your workflow.

How does ComfyUI Flux Accelerator work?

ComfyUI Flux Accelerator accelerates the generation of images by:

Using TAEF1.

TAEF1 is a fast, lightweight autoencoder that encodes and decodes images in a very short time, at the cost of a small loss in quality.

Quantization and Compilation.

ComfyUI Flux Accelerator utilizes torchao and torch.compile() to optimize the model and make it faster.

Skipping redundant DiT blocks.

ComfyUI Flux Accelerator offers an option to skip redundant DiT blocks, which directly reduces generation time.

You can set the blocks to skip in the node (the default is `3, 12` of the MMDiT blocks).
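The block-skipping idea can be sketched in plain Python. This is a minimal illustration, not the node's actual implementation: `make_block`, `run_blocks`, and the 19-block toy stack are hypothetical stand-ins, and reading the default `3, 12` as block indices is an assumption.

```python
from typing import Callable, Iterable, List

Block = Callable[[float], float]


def make_block(index: int, log: List[int]) -> Block:
    """Return a toy block that records its index and adds 1 to its input."""
    def block(x: float) -> float:
        log.append(index)
        return x + 1.0
    return block


def run_blocks(x: float, blocks: List[Block], skip: Iterable[int] = ()) -> float:
    """Apply blocks in order; skipped indices pass x through unchanged."""
    skipped = set(skip)
    for i, block in enumerate(blocks):
        if i in skipped:
            continue  # skipped block costs no compute
        x = block(x)
    return x


executed: List[int] = []
blocks = [make_block(i, executed) for i in range(19)]  # toy stack size (assumption)
result = run_blocks(0.0, blocks, skip=(3, 12))
print(f"executed {len(executed)} of {len(blocks)} blocks")
```

Every skipped block removes one forward pass per sampling step, which is why the saving grows with the step count, and also why quality can degrade if too many blocks are skipped.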

How much faster is ComfyUI Flux Accelerator?

ComfyUI Flux Accelerator can generate images up to _37.25%_ faster than the default settings, measured as the reduction in generation time.

Here are some examples (tested on RTX 4090):

512×512, 4 steps: 0.51s → 0.32s (37.25% faster)

1024×1024, 4 steps: 1.94s → 1.24s (36.08% faster)

1024×1024, 20 steps: 8.77s → 5.74s (34.55% faster)
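The percentages above are reductions in wall-clock time, which can be reproduced from the listed timings. The helper names below are illustrative, and the speedup factor (old time divided by new time) is added only for comparison; it does not appear in the original benchmarks.

```python
def time_reduction_percent(before_s: float, after_s: float) -> float:
    """Percentage of generation time saved, rounded to two decimals."""
    return round((before_s - after_s) / before_s * 100, 2)


def speedup_factor(before_s: float, after_s: float) -> float:
    """How many times faster the accelerated run is, rounded to two decimals."""
    return round(before_s / after_s, 2)


# Timings copied from the RTX 4090 results listed above.
benchmarks = {
    "512x512, 4 steps": (0.51, 0.32),
    "1024x1024, 4 steps": (1.94, 1.24),
    "1024x1024, 20 steps": (8.77, 5.74),
}

for name, (before, after) in benchmarks.items():
    print(f"{name}: {time_reduction_percent(before, after)}% less time, "
          f"{speedup_factor(before, after)}x speedup")
```

In throughput terms, a time reduction of roughly 35% corresponds to generating about 1.5 to 1.6 times as many images in the same period.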

How to install ComfyUI Flux Accelerator?

Clone this repository and place it in the custom_nodes folder of ComfyUI

```bash
git clone https://github.com/discus0434/comfyui-flux-accelerator.git
mv comfyui-flux-accelerator custom_nodes/
```

Install PyTorch and xFormers

```bash
## Copied and modified https://github.com/facebookresearch/xformers/blob/main/README.md

# cuda 11.8 version
pip3 install -U torch torchvision torchao triton xformers --index-url https://download.pytorch.org/whl/cu118

# cuda 12.1 version
pip3 install -U torch torchvision torchao triton xformers --index-url https://download.pytorch.org/whl/cu121

# cuda 12.4 version
pip3 install -U torch torchvision torchao triton xformers --index-url https://download.pytorch.org/whl/cu124
```
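After installing, a quick check that the packages are importable can catch a wrong environment before ComfyUI starts. This is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper, not part of this repository, and it only verifies that each module can be found, not that the versions are compatible with each other or with your CUDA runtime.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Module names taken from the pip command above.
REQUIRED = ("torch", "torchvision", "torchao", "triton", "xformers")


def missing_packages(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
print("missing:", ", ".join(missing) if missing else "none")
```

Run it with the same Python interpreter that launches ComfyUI, since a package installed into a different environment will still be reported as missing.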

Download TAEF1 with the following command

```bash
cd custom_nodes/comfyui-flux-accelerator
chmod +x scripts/download_taef1.sh
./scripts/download_taef1.sh
```

Launch ComfyUI

_The launch command may vary depending on your environment._

a. If you have an H100, L40, or newer GPU

```bash
python main.py --fast --highvram --disable-cuda-malloc
```

b. If you have RTX 4090

```bash
python main.py --fast --highvram
```

c. Otherwise

```bash
python main.py
```
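The three cases above can be expressed as a small lookup if you script your launches. This is a sketch under stated assumptions: `launch_command` is a hypothetical helper, and matching on substrings of the GPU name is an assumption about how the device is reported. GPUs newer than the H100 or L40 are not detected here and would need to be added by hand.

```python
from typing import List


def launch_command(gpu_name: str) -> List[str]:
    """Map a GPU name to the launch command suggested in the cases above."""
    name = gpu_name.upper()
    cmd = ["python", "main.py"]
    if "H100" in name or "L40" in name:
        # Case a: H100, L40 (newer GPUs must be added manually)
        cmd += ["--fast", "--highvram", "--disable-cuda-malloc"]
    elif "RTX 4090" in name:
        # Case b: RTX 4090
        cmd += ["--fast", "--highvram"]
    # Case c: everything else launches without extra flags
    return cmd


print(" ".join(launch_command("NVIDIA GeForce RTX 4090")))
```

Returning a list rather than a single string keeps the result directly usable with `subprocess.run` without shell quoting.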

Load the workflow from the workflow folder

_You can load the workflow by clicking the Load button in ComfyUI._

Enjoy!

How to use ComfyUI Flux Accelerator?

Just use the FluxAccelerator node in the workflow, and you’re good to go!

_If your GPU has less than 24GB of VRAM, you may encounter frequent Out Of Memory errors when changing parameters. Simply ignore them and run the workflow again, and it will work!_

What are the limitations of ComfyUI Flux Accelerator?

ComfyUI Flux Accelerator has the following limitations:

Image Quality

ComfyUI Flux Accelerator sacrifices _a little bit_ of quality for speed by using TAEF1 and skipping redundant DiT layers. If you need high-quality images, you may want to use the default settings.

Compilation Time

ComfyUI Flux Accelerator may take _30-60 seconds_ to compile the model for the first time. This is because it uses torch.compile() to optimize the model.

Compatibility

ComfyUI Flux Accelerator is currently compatible only with Linux.
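To judge whether the compilation time noted above is worth paying, it helps to estimate how many images it takes to recover it. The sketch below combines the 30-60 second compile estimate with the RTX 4090 timings for 1024×1024 at 20 steps. It assumes the model compiles once per session and that those timings apply to your hardware; `images_to_break_even` is an illustrative helper.

```python
import math


def images_to_break_even(compile_s: float, before_s: float, after_s: float) -> int:
    """Number of images after which the time saved exceeds the compile time."""
    saved_per_image = before_s - after_s
    return math.ceil(compile_s / saved_per_image)


# 1024x1024, 20 steps on an RTX 4090: 8.77s -> 5.74s per image.
for compile_s in (30, 60):
    n = images_to_break_even(compile_s, 8.77, 5.74)
    print(f"{compile_s}s compile: break-even after {n} images")
```

Under these assumptions the one-time cost is recovered after about 10 to 20 images, so the node pays off for batch generation more than for single images.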

License

ComfyUI Flux Accelerator is licensed under the MIT License. See LICENSE for more information.