pyarrow~=18.0.0
accelerate~=1.1.0
pydantic~=2.8.2
markdown2[all]
scikit-learn~=1.2.2
requests
httpx
uvicorn
fastapi~=0.112.4
timm~=1.0.11
tiktoken
transformers_stream_generator~=0.0.4
pandas
deepspeed~=0.15.4
pysubs2~=1.7.2
trl~=0.12.1
moviepy~=1.0.3
diffusers~=0.31.0



This repository adds ComfyUI custom nodes that wrap the Ovis-U1 multimodal model, exposing three primary workflows inside the ComfyUI editor.
The nodes use the AIDC-AI/Ovis-U1-3B model from Hugging Face. You can either place the model files in models/ovis/AIDC-AI/Ovis-U1-3B manually (preferred for offline use), or use the Ovis-U1 Model Loader node with automatic download enabled.
Sharded weights and safetensors index files are supported. For large models, ensure sufficient disk space and GPU memory.
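The manual placement option can be sketched in Python. This is an illustrative helper, not the loader node's own code: the function names (`model_dir`, `has_weights`, `ensure_model`) are hypothetical, the check for `config.json` plus safetensors files is an assumed heuristic for "the model is already there", and the download step assumes the `huggingface_hub` package is installed (it normally arrives as a dependency of `diffusers`).

```python
from pathlib import Path

REPO_ID = "AIDC-AI/Ovis-U1-3B"  # model id named in this README


def model_dir(comfy_root):
    """Return the expected model folder under a ComfyUI installation."""
    return Path(comfy_root) / "models" / "ovis" / REPO_ID


def has_weights(path):
    """Heuristic (assumption): a config plus single-file or sharded safetensors."""
    path = Path(path)
    if not (path / "config.json").is_file():
        return False
    single = any(path.glob("*.safetensors"))
    sharded = any(path.glob("*.safetensors.index.json"))
    return single or sharded


def ensure_model(comfy_root, allow_download=False):
    """Return the model folder, optionally downloading it when it is missing."""
    target = model_dir(comfy_root)
    if has_weights(target):
        return target
    if not allow_download:
        raise FileNotFoundError(f"No model files found in {target}")
    # Imported lazily so the offline path works without huggingface_hub.
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=REPO_ID, local_dir=str(target))
    return target
```

Calling `ensure_model("ComfyUI")` only inspects the local folder, which suits offline use; passing `allow_download=True` fetches the repository snapshot into the same folder.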
Clone this repository into your ComfyUI custom_nodes directory and install dependencies:
cd ComfyUI/custom_nodes
git clone https://github.com/neverbiasu/ComfyUI-Ovis-U1.git
cd ComfyUI-Ovis-U1
pip install -r requirements.txt
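After installation, it can help to confirm that the pinned packages resolved to compatible versions. The following is a minimal sketch, assuming the pins use the `~=` compatible-release operator and plain numeric versions; the helpers `is_compatible` and `check_pins` are hypothetical and not part of this repository.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def _release(ver):
    """Extract the leading numeric release segment, e.g. '0.15.4+cu118' -> (0, 15, 4)."""
    match = re.match(r"\d+(?:\.\d+)*", ver)
    if match is None:
        raise ValueError(f"Unsupported version string: {ver!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_compatible(installed, pinned):
    """Approximate 'installed ~= pinned': same prefix and not older than the pin."""
    inst, pin = _release(installed), _release(pinned)
    if len(pin) < 2:
        raise ValueError("~= needs at least two release segments")
    # Pad with zeros so that '1.1' and '1.1.0' compare as equal.
    width = max(len(inst), len(pin))
    inst_padded = inst + (0,) * (width - len(inst))
    pin_padded = pin + (0,) * (width - len(pin))
    prefix = pin[:-1]
    return inst_padded[: len(prefix)] == prefix and inst_padded >= pin_padded


def check_pins(pins):
    """Map each package name to 'ok', 'missing', or 'incompatible (<version>)'."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = is_compatible(installed, pinned)
        report[name] = "ok" if ok else f"incompatible ({installed})"
    return report


if __name__ == "__main__":
    # A few pins copied from this repository's requirements.
    print(check_pins({"accelerate": "1.1.0", "diffusers": "0.31.0", "trl": "0.12.1"}))
```

The check ignores pre-release and local version suffixes, so treat it as a quick sanity check rather than a replacement for pip's own resolver.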
Text-to-image: generate high-quality images from natural language prompts. This workflow shows the minimal node chain to go from prompt to final image.
Image understanding: perform captioning, visual question answering, and scene understanding. Suitable for extracting structured descriptions from images.
Image editing: instruction-guided image editing. The node implements the official three-step conditional flow (unconditional, image-only, final conditioned) to produce high-quality edits.
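To give a sense of how three conditional passes can be merged into one prediction, here is a minimal sketch of the dual guidance rule popularized by InstructPix2Pix. That Ovis-U1 combines its unconditional, image-only, and fully conditioned outputs in exactly this way is an assumption; the function name `combine_guidance` and the default scale values are hypothetical and chosen only for illustration.

```python
def combine_guidance(uncond, img_only, full, img_cfg=1.5, txt_cfg=6.0):
    """Blend three model predictions element-wise.

    uncond   -- prediction with neither image nor text condition
    img_only -- prediction conditioned on the input image only
    full     -- prediction conditioned on the image and the edit instruction
    img_cfg  -- strength of the pull toward the input image (illustrative default)
    txt_cfg  -- strength of the pull toward the instruction (illustrative default)
    """
    if not (len(uncond) == len(img_only) == len(full)):
        raise ValueError("All three predictions must have the same length")
    return [
        u + img_cfg * (i - u) + txt_cfg * (f - i)
        for u, i, f in zip(uncond, img_only, full)
    ]


if __name__ == "__main__":
    # With both scales set to 1 the result equals the fully conditioned prediction.
    print(combine_guidance([0.0, 1.0], [1.0, 1.0], [2.0, 3.0], img_cfg=1.0, txt_cfg=1.0))
```

Raising the image scale keeps the edit closer to the source image, while raising the text scale pushes the result further toward the instruction.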
This project is licensed under the Apache 2.0 License. Please refer to the official license terms for the use of the Ovis-U1 model.