comfy_clip_blip_node

★ 30

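CLIP image-encoding multimodal plugin for ComfyUI
Provides a CLIP encoder node that accepts image input and converts the image into a CLIP vector, for use in multimodal conditioning or similarity search within ComfyUI.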
🍴 8 Forks · 💻 Python · 🔄 2024-05-22
📦 requirements.txt
timm==0.4.12
transformers==4.15.0
fairscale==0.4.4
pycocoevalcap
📄 README

A ComfyUI Node for adding BLIP in CLIPTextEncode

Announcement: BLIP is now officially integrated into CLIPTextEncode

Dependencies

  • [x] Fairscale>=0.4.4 (NOT in ComfyUI)
  • [x] Transformers==4.26.1 (already in ComfyUI)
  • [x] Timm>=0.4.12 (already in ComfyUI)
  • [x] Gitpython (already in ComfyUI)
Local Installation

Inside ComfyUI_windows_portable\python_embeded, run:

    python.exe -m pip install fairscale

Then, inside ComfyUI_windows_portable\ComfyUI\custom_nodes\, run:

    git clone https://github.com/paulo-coronado/comfy_clip_blip_node

Google Colab Installation

Add a cell with the following code:

    !pip install fairscale

    !cd custom_nodes && git clone https://github.com/paulo-coronado/comfy_clip_blip_node
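After installing, it can help to confirm that the packages from the Dependencies list are visible to the interpreter that runs ComfyUI. The following is a minimal sketch, not part of this repository: the helper name check_dependencies is hypothetical, and it only reports installed versions without comparing them against the version constraints listed above.

```python
from importlib import metadata

# Distribution names taken from the Dependencies list in this README.
REQUIRED = ("fairscale", "transformers", "timm", "GitPython")


def check_dependencies(names=REQUIRED):
    """Return {distribution name: installed version, or None if missing}.

    Hypothetical helper for illustration; it does not validate version ranges.
    """
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in check_dependencies().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

On the Windows portable build, run the script with python.exe from ComfyUI_windows_portable\python_embeded so that it inspects the same environment ComfyUI uses.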

How to use

  • Add the CLIPTextEncodeBLIP node;
  • Connect an image to the node and choose values for min_length and max_length, which bound the length of the generated BLIP caption;
  • Optional: to embed the BLIP text in a larger prompt, use the keyword BLIP_TEXT (e.g. “a photo of BLIP_TEXT”, medium shot, intricate details, highly detailed).
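The BLIP_TEXT keyword described above can be pictured as a plain string substitution. The sketch below is illustrative only and is not the node's actual code: the function name embed_caption is hypothetical, and the fallback of using the caption alone when the keyword is absent is an assumption.

```python
BLIP_KEYWORD = "BLIP_TEXT"


def embed_caption(prompt: str, caption: str) -> str:
    """Insert a BLIP caption into a prompt at the BLIP_TEXT keyword.

    Hypothetical helper for illustration; the real node may behave differently.
    """
    if BLIP_KEYWORD in prompt:
        return prompt.replace(BLIP_KEYWORD, caption)
    # Assumption: with no keyword present, the caption alone becomes the prompt.
    return caption


if __name__ == "__main__":
    template = '"a photo of BLIP_TEXT", medium shot, intricate details, highly detailed'
    print(embed_caption(template, "a dog sitting on a beach"))
```

The resulting string is what would then be passed on for CLIP text encoding, so the caption and the hand-written style terms end up in a single prompt.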
Acknowledgement

The implementation of CLIPTextEncodeBLIP relies on resources from BLIP, ALBEF, Huggingface Transformers, and timm. We thank the original authors for open-sourcing their work.