ComfyUI-Janus_pro_vision

★ 31

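Image understanding · Multi-turn chat · Dual-image comparison · Automatic model download

Integrates the Janus-Pro-7B vision-language model into ComfyUI on your local machine, providing powerful image understanding, dual-image comparison, and multi-turn image conversation

💡 Get detailed image descriptions and hold multi-turn interactive Q&A about images inside ComfyUI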

🍴 1 Fork · 💻 Python · 🔄 2025-03-20

📦 requirements.txt

requests
tqdm
attrdict

📄 README

ComfyUI Janus Pro Vision

Support My Work

If you find this project helpful, consider buying me a coffee:

https://buymeacoffee.com/shmuelronen

A ComfyUI custom node extension that integrates the Janus-Pro-7B vision-language model from DeepSeek AI and runs it on your local computer, enabling powerful image understanding and multi-turn conversation capabilities.

Vision Mode (One or two images)

Chat Mode (One or two images)

Features

🖼️ Advanced Image Analysis: Leverages Janus-Pro-7B’s capabilities for detailed image understanding and description

💬 Multi-turn Chat: Supports interactive conversations about images with context awareness

🔄 Dual Image Support: Can analyze relationships between two images simultaneously

🚀 Automatic Model Download: Downloads model files automatically on first use

⚙️ Flexible Configuration: Customizable parameters for generation and image processing

🎯 ComfyUI Integration: Seamless integration with ComfyUI workflow

Installation

Clone this repository into your ComfyUI custom nodes folder:

cd ComfyUI/custom_nodes
git clone https://github.com/ShmuelRonen/ComfyUI-Janus_pro_vision.git

Install required dependencies:

pip install requests
pip install tqdm
pip install attrdict

The model files will be automatically downloaded on first use from DeepSeek’s HuggingFace repository.

If the automatic model download fails, you can download the files manually into the models\Janus-Pro folder:

git clone https://huggingface.co/deepseek-ai/Janus-Pro-7B
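To check whether a manual download landed where the extension can find it, a small helper along these lines may be useful. This is only a sketch: the function name `find_janus_model_dir` is hypothetical, and it assumes that the model folder lives under `models/Janus-Pro` inside the ComfyUI root and that a downloaded copy contains a `config.json` file. The extension's own lookup logic may differ.

```python
from pathlib import Path


def find_janus_model_dir(comfyui_root):
    """Return the folder under models/Janus-Pro that holds config.json, or None.

    Assumption: a usable model copy contains a config.json file. The search is
    recursive, so both models/Janus-Pro and models/Janus-Pro/Janus-Pro-7B
    (the layout produced by running git clone inside the folder) are accepted.
    """
    base = Path(comfyui_root) / "models" / "Janus-Pro"
    if not base.is_dir():
        return None
    # sorted() makes the result deterministic if several copies exist.
    for config_file in sorted(base.rglob("config.json")):
        return config_file.parent
    return None


if __name__ == "__main__":
    found = find_janus_model_dir("ComfyUI")
    if found is None:
        print("No model files found; try the manual git clone step.")
    else:
        print(f"Model files found in: {found}")
```

If the helper returns None after a manual clone, the files were most likely placed in a different folder than the one assumed here.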

Available Nodes

1. Janus-7b-Pro Model Loader (Upload)

Handles model loading and management.

Input: None (uses default model path)

Output: JANUS_MODEL (model object for use in analyzer)

2. Janus Vision 7b Pro (Chat)

Main analysis node with chat capabilities.

Inputs:

janus_model: Model object from loader node

image_a: Primary image for analysis

image_b: (Optional) Secondary image for comparison

prompt: Text prompt/question about the image(s)

chat_mode: Enable/disable chat functionality

seed: Random seed for generation

temperature: Generation temperature (0.0 – 2.0)

top_p: Top-p sampling parameter (0.0 – 1.0)

max_tokens: Maximum generation length

image_size: Target image size for processing (512-2048)

frame_size: Border thickness for image display (1-10)

reset_chat: Clear chat history

Outputs:

response: Model’s response text

chat_history: Formatted chat history (in chat mode)
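The inputs and outputs listed above map naturally onto ComfyUI's custom node conventions (`INPUT_TYPES`, `RETURN_TYPES`, `RETURN_NAMES`, `FUNCTION`). The sketch below shows how a node with this interface could be declared; it is not the extension's actual source. The class name, the method name `analyze`, the category, and the defaults for `prompt`, `seed`, `temperature`, and `max_tokens` are assumptions for illustration. The parameter names, the ranges, the `image_size` and `frame_size` defaults, and the `top_p` value of 0.95 come from this README.

```python
class JanusVisionChatSketch:
    """Hypothetical declaration of a node with the interface described above."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "janus_model": ("JANUS_MODEL",),  # output of the loader node
                "image_a": ("IMAGE",),
                # Default prompt text is an assumption.
                "prompt": ("STRING", {"multiline": True,
                                      "default": "Describe this image."}),
                "chat_mode": ("BOOLEAN", {"default": False}),
                # Seed range is an assumption.
                "seed": ("INT", {"default": 0, "min": 0, "max": 2**32 - 1}),
                # Range from the README; default and step are assumptions.
                "temperature": ("FLOAT", {"default": 0.1, "min": 0.0,
                                          "max": 2.0, "step": 0.05}),
                # Range and recommended value from the README.
                "top_p": ("FLOAT", {"default": 0.95, "min": 0.0,
                                    "max": 1.0, "step": 0.01}),
                # Default and range are assumptions.
                "max_tokens": ("INT", {"default": 512, "min": 1, "max": 4096}),
                # Default, range, and step from the README.
                "image_size": ("INT", {"default": 1024, "min": 512,
                                       "max": 2048, "step": 64}),
                "frame_size": ("INT", {"default": 2, "min": 1, "max": 10}),
                "reset_chat": ("BOOLEAN", {"default": False}),
            },
            "optional": {
                "image_b": ("IMAGE",),  # second image for comparison
            },
        }

    RETURN_TYPES = ("STRING", "STRING")
    RETURN_NAMES = ("response", "chat_history")
    FUNCTION = "analyze"      # hypothetical method name
    CATEGORY = "Janus-Pro"    # hypothetical menu category

    def analyze(self, janus_model, image_a, prompt, chat_mode, seed,
                temperature, top_p, max_tokens, image_size, frame_size,
                reset_chat, image_b=None):
        # Model inference is intentionally left out of this sketch.
        raise NotImplementedError("inference is not part of this sketch")
```

Declaring `image_b` under "optional" is what lets the same node serve both the single-image and the two-image workflows.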

Configuration

Image Processing Parameters

image_size: Controls the maximum dimension while maintaining aspect ratio (default: 1024)

Range: 512 to 2048 pixels

Step: 64 pixels

Example: If image is 2000x1000px and image_size=1024:

Width will be scaled to 1024

Height will be scaled proportionally to 512

frame_size: Border thickness for visual separation (default: 2)

Range: 1 to 10 pixels

Example values:

frame_size=1: Thin border

frame_size=2: Standard border

frame_size=5: Thick border

frame_size=10: Very thick border
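The image_size arithmetic in the example above can be reproduced with a few lines of Python. This is a minimal sketch, not the extension's code: the function name `fit_to_image_size` is hypothetical, and it assumes the longest side is always scaled to exactly image_size (including upscaling of smaller images), which the node itself may handle differently.

```python
def fit_to_image_size(width, height, image_size=1024):
    """Scale (width, height) so the longest side equals image_size.

    The aspect ratio is preserved. The 512..2048 bounds follow the range
    documented for the image_size parameter.
    """
    if not 512 <= image_size <= 2048:
        raise ValueError("image_size must be between 512 and 2048")
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    new_width = max(1, round(width * image_size / longest))
    new_height = max(1, round(height * image_size / longest))
    return new_width, new_height


print(fit_to_image_size(2000, 1000, 1024))  # (1024, 512)
```

Portrait images work the same way, with the height becoming the side that is scaled to image_size.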

Generation Parameters

temperature: Controls response randomness

0.1: More focused and deterministic

0.7: More creative and varied

top_p: Nucleus sampling parameter (0.95 recommended)

max_tokens: Maximum length of generated response
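To make the effect of temperature and top_p concrete, here is a generic sketch of temperature scaling followed by nucleus (top-p) sampling over a list of token scores. It illustrates the general technique only; the function name `sample_next_token` is hypothetical, and the model's real generation code runs inside the underlying libraries rather than in a helper like this.

```python
import math
import random


def sample_next_token(logits, temperature=0.7, top_p=0.95, seed=0):
    """Pick a token index using temperature scaling and top-p sampling."""
    if temperature <= 0:
        # Treat zero temperature as greedy decoding: always the best token.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Lower temperature sharpens the distribution, higher flattens it.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]  # stable softmax
    total = sum(exps)
    probs = [value / total for value in exps]

    # Keep the smallest set of top tokens whose probabilities reach top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break

    # Sample among the kept tokens; the seed makes the draw repeatable.
    rng = random.Random(seed)
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


print(sample_next_token([2.0, 1.0, 0.1], temperature=0.0))  # 0
```

With a low temperature such as 0.1 the top-scoring token dominates and answers become nearly repeatable, while 0.7 leaves more probability for alternatives. Fixing the seed makes a given configuration reproducible.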

Model Information

This extension uses the Janus-Pro-7B model from DeepSeek AI, which offers:

Strong image understanding capabilities

Multi-turn conversation support

High-quality natural language generation

Support for image comparison and analysis

Requirements

ComfyUI

Python 3.8+

PyTorch

Transformers library

requests

tqdm

attrdict

License

This project is MIT licensed. The Janus-Pro-7B model has its own license from DeepSeek AI.

Acknowledgments

DeepSeek AI for the Janus-Pro-7B model

ComfyUI community for the framework and support

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.