comfy_clip_blip_node

★ 30

CLIP image-encoding multimodal ComfyUI plugin

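Provides a CLIP encoder node that accepts image input and converts the image into CLIP vectors, for use in multimodal conditioning and similarity retrieval.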

💡 Encode images into CLIP vectors in ComfyUI for multimodal conditioning or similarity retrieval.

🍴 8 Forks 💻 Python 🔄 2024-05-22

📦 requirements.txt

timm==0.4.12
transformers==4.15.0
fairscale==0.4.4
pycocoevalcap

📄 README

A ComfyUI Node for adding BLIP in CLIPTextEncode

Announcement: BLIP is now officially integrated into CLIPTextEncode

Dependencies

[x] Fairscale>=0.4.4 (NOT in ComfyUI)

[x] Transformers==4.26.1 (already in ComfyUI)

[x] Timm>=0.4.12 (already in ComfyUI)

[x] Gitpython (already in ComfyUI)
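
To confirm which of the dependencies above are present in a given Python environment, a small check along these lines may help. This is only a sketch: the helper name `installed_versions` is hypothetical and not part of this node, and the distribution names are assumed to match their PyPI spellings. Run it with the same interpreter that ComfyUI uses.

```python
from importlib import metadata

# Distribution names assumed to match PyPI; GitPython provides the `git` module.
REQUIRED = ["fairscale", "transformers", "timm", "GitPython"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'MISSING'}")
```

Any package reported as MISSING would need to be installed before the node can load.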

Local Installation

Inside ComfyUI_windows_portable\python_embeded, run:

python.exe -m pip install fairscale

Then, inside ComfyUI_windows_portable\ComfyUI\custom_nodes\, run:

git clone https://github.com/paulo-coronado/comfy_clip_blip_node

Google Colab Installation

Add a cell with the following code:

!pip install fairscale

!cd custom_nodes && git clone https://github.com/paulo-coronado/comfy_clip_blip_node

How to use

Add the CLIPTextEncodeBLIP node;

Connect an image to the node and set values for min_length and max_length;

Optional: to embed the BLIP text in a prompt, use the keyword BLIP_TEXT (e.g. “a photo of BLIP_TEXT, medium shot, intricate details, highly detailed”).
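
The BLIP_TEXT keyword in the last step can be pictured as a plain string substitution. The sketch below is illustrative only and is not taken from the node's source: the function name `embed_caption` is hypothetical, and the fallback of using the caption alone when the prompt is empty is an assumption.

```python
def embed_caption(prompt: str, caption: str, keyword: str = "BLIP_TEXT") -> str:
    """Insert a caption into a prompt wherever the keyword appears.

    Assumption (not confirmed by the README): an empty prompt falls back to
    the caption alone. A non-empty prompt without the keyword is returned
    unchanged.
    """
    if not prompt.strip():
        return caption
    return prompt.replace(keyword, caption)


if __name__ == "__main__":
    # The caption here is a made-up stand-in for what BLIP might generate.
    print(embed_caption("a photo of BLIP_TEXT, medium shot", "a dog on a beach"))
```

Under these assumptions, the resulting text is what would then be passed on for CLIP text encoding.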

Acknowledgement

The implementation of CLIPTextEncodeBLIP relies on resources from BLIP, ALBEF, Huggingface Transformers, and timm. We thank the original authors for open-sourcing their work.