ComfyUI-Kimi-VL

★ 1

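Multimodal, model integration, inference nodes, easy to install
Provides Kimi-VL model nodes for ComfyUI, supporting multimodal image-and-text inference and model management so that Kimi-VL can be integrated and called seamlessly within a pipeline.
💡 Load and run Kimi-VL multimodal inference in a ComfyUI pipeline.
🍴 1 Fork 💻 Python 🔄 2025-04-17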
📦 requirements.txt
torch
torchvision
transformers>=4.45.0
pillow
tiktoken
accelerate
blobfile
📄 README

ComfyUI-Kimi-VL

Make Kimi-VL available in ComfyUI.

Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities.

Installation

  • Make sure you have ComfyUI installed.
  • Clone this repository into your ComfyUI's custom_nodes directory:
    cd ComfyUI/custom_nodes
    git clone https://github.com/Yuan-ManX/ComfyUI-Kimi-VL.git

  • Install dependencies:
    cd ComfyUI-Kimi-VL
    pip install -r requirements.txt
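
After installing, it can help to confirm that the packages from requirements.txt are actually present in the Python environment ComfyUI runs in, and that transformers meets the stated minimum of 4.45.0. The following is a minimal sketch, not part of this repository: the names `REQUIREMENTS`, `parse_version`, and `check_requirements` are hypothetical, and the version parser is deliberately simple (it only reads the leading numeric parts).

```python
from importlib.metadata import PackageNotFoundError, version

# Packages listed in requirements.txt; only transformers has a stated minimum.
REQUIREMENTS = {
    "torch": None,
    "torchvision": None,
    "transformers": (4, 45, 0),
    "pillow": None,
    "tiktoken": None,
    "accelerate": None,
    "blobfile": None,
}


def parse_version(text):
    """Turn '4.45.0' or '4.46.0.dev0' into a tuple of at least three integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)  # pad so that '4.45' compares equal to '4.45.0'
    return tuple(parts)


def check_requirements(requirements=REQUIREMENTS):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for name, minimum in requirements.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if minimum is not None and parse_version(installed) < minimum:
            wanted = ".".join(map(str, minimum))
            problems.append(f"{name} {installed} is older than {wanted}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

Run it with the same interpreter that launches ComfyUI; no output means every listed package was found and the transformers minimum is satisfied.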

Model

🤗 For general multimodal perception and understanding, OCR, long video and long document, video perception, and agent uses, we recommend Kimi-VL-A3B-Instruct for efficient inference; for advanced text and multimodal reasoning (e.g. math), please consider using Kimi-VL-A3B-Thinking.

| Model | #Total Params | #Activated Params | Context Length | Download Link |
| :---: | :---: | :---: | :---: | :---: |
| Kimi-VL-A3B-Instruct | 16B | 3B | 128K | 🤗 Hugging Face |
| Kimi-VL-A3B-Thinking | 16B | 3B | 128K | 🤗 Hugging Face |

> [!Note]
> Recommended parameter settings:
> - For Thinking models, it is recommended to use Temperature = 0.6.
> - For Instruct models, it is recommended to use Temperature = 0.2.
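
The recommended temperatures can be selected automatically from the model name. The sketch below is illustrative only and is not how this repository's nodes are implemented: `recommended_temperature` and `generation_kwargs` are hypothetical helpers, the `max_new_tokens` default of 512 is an arbitrary placeholder, and it assumes the resulting dictionary is passed to a Hugging Face Transformers `generate()` call, where `do_sample`, `temperature`, and `max_new_tokens` are standard arguments.

```python
# Recommended temperatures from the README note, keyed by model variant.
RECOMMENDED_TEMPERATURE = {
    "thinking": 0.6,
    "instruct": 0.2,
}


def recommended_temperature(model_name):
    """Pick the recommended temperature from the variant suffix of the name."""
    lowered = model_name.lower()
    for variant, temperature in RECOMMENDED_TEMPERATURE.items():
        if lowered.endswith(variant):
            return temperature
    raise ValueError(f"No recommended temperature for {model_name!r}")


def generation_kwargs(model_name, max_new_tokens=512):
    """Build sampling arguments for generate(); max_new_tokens is a placeholder."""
    return {
        "do_sample": True,  # temperature only takes effect when sampling
        "temperature": recommended_temperature(model_name),
        "max_new_tokens": max_new_tokens,
    }


if __name__ == "__main__":
    for name in ("Kimi-VL-A3B-Instruct", "Kimi-VL-A3B-Thinking"):
        print(name, generation_kwargs(name))
```

Keying on the name suffix keeps the lookup working whether the name is given bare or with an organization prefix in front of it; an unrecognized name raises an error rather than silently falling back to a default.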