ComfyUI-Ovis-U1

★ 4

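Multimodal model wrapper · Device and dtype selection · ComfyUI custom nodes

Wraps the Ovis-U1 multimodal model as ComfyUI nodes, providing three workflows (understanding, generation, and editing) with device and dtype selection to simplify model deployment and use.

💡 Quickly run Ovis-U1 multimodal understanding, generation, and editing workflows in ComfyUI.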

🍴 1 Fork · 💻 Python · 🔄 2025-12-29

📦 requirements.txt

pyarrow~=18.0.0
accelerate~=1.1.0
pydantic~=2.8.2
markdown2[all]
scikit-learn~=1.2.2
requests
httpx
uvicorn
fastapi~=0.112.4
timm~=1.0.11
tiktoken
transformers_stream_generator~=0.0.4
pandas
deepspeed~=0.15.4
pysubs2~=1.7.2
trl~=0.12.1
moviepy~=1.0.3
diffusers~=0.31.0

📄 README

ComfyUI-Ovis-U1

This repository adds ComfyUI custom nodes that wrap the Ovis-U1 multimodal model, exposing three workflows inside the ComfyUI editor: text-to-image generation, image understanding, and image editing.

Features

Unified model support (understanding, generation, editing) through simple ComfyUI nodes

Options for device and dtype selection when loading the model (BF16 / FP16 / FP32)
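As a rough illustration of how a loader node might expose the device and dtype options, here is a minimal sketch that follows ComfyUI's custom-node conventions (INPUT_TYPES, RETURN_TYPES, FUNCTION, CATEGORY). The class name OvisU1LoaderSketch, the helper resolve_dtype_name, the return type label, and the CPU fallback to float32 are all assumptions made for this example, not the repository's actual code.

```python
# Hypothetical sketch: names and fallback policy are illustrative only.

# Map the user-facing precision labels to torch dtype attribute names.
DTYPE_NAMES = {"bf16": "bfloat16", "fp16": "float16", "fp32": "float32"}


def resolve_dtype_name(requested: str, device: str) -> str:
    """Return the torch dtype attribute name for a requested precision.

    Assumption for this sketch: on CPU, half-precision requests are
    downgraded to float32 as a conservative default.
    """
    key = requested.lower()
    if key not in DTYPE_NAMES:
        raise ValueError(
            f"unsupported dtype {requested!r}; expected one of {sorted(DTYPE_NAMES)}"
        )
    if device == "cpu" and key != "fp32":
        return "float32"
    return DTYPE_NAMES[key]


class OvisU1LoaderSketch:
    """Hypothetical loader node with device and dtype dropdowns."""

    CATEGORY = "Ovis-U1"
    RETURN_TYPES = ("OVIS_U1_MODEL",)  # assumed custom type label
    FUNCTION = "load"

    @classmethod
    def INPUT_TYPES(cls):
        # A list inside a one-element tuple renders as a dropdown in ComfyUI.
        return {
            "required": {
                "device": (["cuda", "cpu"],),
                "dtype": (["bf16", "fp16", "fp32"],),
            }
        }

    def load(self, device, dtype):
        import torch  # imported lazily so the helper above works without torch

        torch_dtype = getattr(torch, resolve_dtype_name(dtype, device))
        # Actual model loading is omitted; only the resolved settings are returned.
        return ({"device": device, "dtype": torch_dtype},)
```

Keeping the label-to-dtype mapping in a plain function makes the selection logic easy to check without a GPU or a model download.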

Model download

The nodes use the AIDC-AI/Ovis-U1-3B model from Hugging Face. You can either:

Place the model under models/ovis/AIDC-AI/Ovis-U1-3B manually (preferred for offline use), or

Use the Ovis-U1 Model Loader node with automatic download enabled.

Sharded weights and safetensors index files are supported. For large models, ensure sufficient disk space and GPU memory.
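For the manual placement option, one possible approach is to fetch the repository with the huggingface_hub library's snapshot_download function. The sketch below assumes that models/ovis/... is resolved relative to the ComfyUI root directory and that huggingface_hub is installed; the helper names local_model_dir and download_model are hypothetical.

```python
from pathlib import Path

REPO_ID = "AIDC-AI/Ovis-U1-3B"


def local_model_dir(comfyui_root: str, repo_id: str = REPO_ID) -> Path:
    """Build the expected model folder (assumed relative to the ComfyUI root)."""
    return Path(comfyui_root) / "models" / "ovis" / repo_id


def download_model(comfyui_root: str) -> Path:
    """Download the full model snapshot into the expected folder."""
    # Imported lazily so the path helper works without huggingface_hub.
    from huggingface_hub import snapshot_download

    target = local_model_dir(comfyui_root)
    snapshot_download(repo_id=REPO_ID, local_dir=str(target))
    return target


if __name__ == "__main__":
    # Prints the folder the model would be placed in; no download is triggered.
    print(local_model_dir("ComfyUI"))
```

Calling download_model("ComfyUI") would then populate the folder once, after which the loader can be used offline.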

Installation

Clone this repository into your ComfyUI custom_nodes directory and install dependencies:

cd ComfyUI/custom_nodes
git clone https://github.com/neverbiasu/ComfyUI-Ovis-U1.git
cd ComfyUI-Ovis-U1
pip install -r requirements.txt

Workflows

Text-to-Image Generation

Generate high-quality images from natural language prompts. This workflow shows the minimal node chain to go from prompt to final image.

Image Understanding (Image → Text)

Perform captioning, visual question answering, and scene understanding. Suitable for extracting structured descriptions from images.

Image Editing Workflow

Instruction-guided image editing. The node implements the official three-step conditional flow (unconditional, image-only, final conditioned) to produce high-quality edits.
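The README does not say how the three predictions are combined. A common way to combine unconditional, image-only, and fully conditioned predictions is dual classifier-free guidance in the style of InstructPix2Pix, sketched below on plain Python lists. Whether the node uses exactly this formula is an assumption, and the function and parameter names (combine_guidance, image_scale, text_scale) are hypothetical.

```python
def combine_guidance(uncond, image_only, full, image_scale, text_scale):
    """Combine three model predictions with two guidance scales.

    Assumed formula (InstructPix2Pix-style dual guidance):
        out = uncond
              + image_scale * (image_only - uncond)
              + text_scale * (full - image_only)
    """
    if not (len(uncond) == len(image_only) == len(full)):
        raise ValueError("all three predictions must have the same length")
    return [
        u + image_scale * (i - u) + text_scale * (f - i)
        for u, i, f in zip(uncond, image_only, full)
    ]


if __name__ == "__main__":
    # With both scales at 1.0 the result equals the fully conditioned prediction.
    print(combine_guidance([0.0, 1.0], [1.0, 1.0], [3.0, 2.0], 1.0, 1.0))
```

Raising image_scale pushes the result toward fidelity to the input image, while raising text_scale strengthens adherence to the edit instruction.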

License

This project is licensed under the Apache 2.0 License. Please refer to the official license terms for the use of the Ovis-U1 model.