ComfyUI_VLM_nodes

ComfyUI_VLM_nodes
★ 567

视觉语言模型图像描述自动提示生成关键词提取
为ComfyUI提供视觉-语言模型与LLM自定义节点,自动生成图像描述、提示与关键词,提升提示创意与一致性
💡 自动为图像生成描述与创意提示,优化ComfyUI提示
🍴 59 Forks💻 Python🔄 2026-01-11
📦
网盘下载
复制链接后前往夸克网盘下载
https://pan.quark.cn/s/6862a2001521
📦 requirements.txt
accelerate>=1.0
bitsandbytes
cffi
decord
diffusers
>=0.31.0
diskcache
einops>=0.7.0
gitpython
huggingface-hub>=0.26.2
matplotlib
moviepy
numpy>=1.26.4,<2.0.0
openai>=0.27.8
opencv-python
optimum>=1.17.0
pillow>=9.4.0
py-cpuinfo>=3.3.0
python-dateutil>=2.7.0
pytz
qwen-vl-utils
safetensors>=0.4.1
scikit-build
six
soundfile
symusic
torch>=2.0.1
torchvision>=0.15.2
transformers>=4.46
structured
image
image
image
image
VLM + LLM
image
image
image
image
image
image
image
📄 README

Usage

  • For Windows and Linux
  • cd custom_nodes
    git clone https://github.com/gokayfem/ComfyUI_VLM_nodes.git

    Acknowledgements

  • JAGS
  • EnragedAntelope
  • If you get errors related to llama-cpp-python or if it is not using GPU.

    I recommend installing it with the right arguments provided in this link llama-cpp-python

    Tools

    | Tool | Description |

    |——|————-|

    | DualView | Free side-by-side comparison tool for VLM outputs, images, videos, and AI prompts |

    VLM Nodes

    Utilizes “llama-cpp-python“ for integration of LLaVa models. You can load and use any VLM with LLaVa models in GGUF format with this nodes.

    You need to download the model similar to “ggml-model-q4_k.gguf` and it's clip projector similar to `mmproj-model-f16.gguf“ from this repositories (in the files and versions).