comfyui_dagthomas

★ 277

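Custom-node workflows for generative-art prompt assistance
comfyui_dagthomas is a set of ComfyUI custom nodes that simplifies prompt construction and large-scale random art and photography generation. It supports 24 node categories and chained workflows, which makes it easy to assemble complex generation pipelines quickly.
💡 Quickly build ComfyUI pipelines for random art or photography generation.
🍴 27 Forks · 💻 Python · 🔄 2025-12-13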
📦 requirements.txt
Pillow==10.4.0
requests==2.32.5
openai==1.44.0
blend-modes==2.1.0
huggingface_hub>=0.34.0
color_matcher==0.5.0
chardet==5.2.0
google-generativeai==0.7.2
anthropic
transformers>=4.40.0
decord>=0.6.0
scipy>=1.10.0
tqdm>=4.67.1
huggingface_hub[hf_xet]
Node Family Overview
Node Chaining Example
📄 README


Chinese-language information is available here:

plugin.aix.ink

Advanced Prompt Generation & Multi-Model AI Integration for ComfyUI

A comprehensive suite of nodes for ComfyUI featuring multi-provider LLM support (OpenAI, Gemini, Claude, Grok, Groq, QwenVL), local model inference (Phi, MiniCPM, Ollama), professional image effects, and advanced prompt generation tools.


📦 Installation

Method 1: ComfyUI Manager (Recommended)

Search for “comfyui_dagthomas” in ComfyUI Manager and click Install.

Method 2: Manual Installation

cd ComfyUI/custom_nodes
git clone https://github.com/dagthomas/comfyui_dagthomas
cd comfyui_dagthomas
pip install -r requirements.txt


🔑 API Key Configuration

Set your API keys as environment variables. The commands below use Windows `set` syntax; on Linux or macOS, use `export NAME=value` instead:

# OpenAI GPT
set OPENAI_API_KEY=sk-your-key-here

# Google Gemini
set GEMINI_API_KEY=your-key-here

# Anthropic Claude
set ANTHROPIC_API_KEY=your-key-here
# or
set CLAUDE_API_KEY=your-key-here

# xAI Grok
set XAI_API_KEY=your-key-here
# or
set GROK_API_KEY=your-key-here

# Groq
set GROQ_API_KEY=your-key-here
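
To check which providers are usable before launching ComfyUI, a small script can look for these variables. The sketch below is illustrative only: the provider labels and the "first name set wins" rule are assumptions, not the plugin's actual detection code.

```python
import os

# Variable names come from the list above. The provider labels and the
# lookup order are assumptions made for this sketch.
PROVIDER_ENV_VARS = {
    "gpt": ["OPENAI_API_KEY"],
    "gemini": ["GEMINI_API_KEY"],
    "claude": ["ANTHROPIC_API_KEY", "CLAUDE_API_KEY"],
    "grok": ["XAI_API_KEY", "GROK_API_KEY"],
    "groq": ["GROQ_API_KEY"],
}


def detect_providers(env=None):
    """Return {provider: variable_name} for each provider with a non-empty key."""
    env = os.environ if env is None else env
    found = {}
    for provider, names in PROVIDER_ENV_VARS.items():
        for name in names:
            if env.get(name, "").strip():
                found[provider] = name
                break  # the first variable that is set wins
    return found
```

Calling `detect_providers()` with no arguments inspects the current process environment; passing a dict makes the function easy to test.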


🧩 Node Categories

📝 Universal Nodes (Model-Agnostic)

APNext Universal Generator

Display Name: APNext Universal Generator

A model-agnostic prompt generator that automatically detects available API keys and supports all major LLM providers.

| Input | Description |
|-------|-------------|
| input_text | Base text to enhance |
| model | Select provider:model or “auto-detect” |
| generation_mode | Creative, Balanced, Focused, or Custom |
| seed | Seed for reproducible variations |
| style_preference | Cinematic, Photorealistic, Artistic, etc. |
| detail_level | Brief to Very Detailed output |

Supported Models:

  • gpt:gpt-4o, gpt:gpt-4o-mini, gpt:gpt-4-turbo
  • gemini:gemini-2.5-flash, gemini:gemini-2.5-pro
  • claude:claude-sonnet-4.5, claude:claude-3-5-sonnet
  • grok:grok-beta, grok:grok-2-vision
  • groq:llama-3.3-70b-versatile

Returns: (generated_prompt, model_used, seed_used)
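
The model identifiers above follow a `provider:model` pattern. As a rough illustration of how such a string can be split, here is a minimal sketch; `split_model_spec` is a hypothetical helper, and mapping “auto-detect” to `(None, None)` is an assumption.

```python
def split_model_spec(spec):
    """Split 'provider:model' into (provider, model).

    'auto-detect' is returned as (None, None) so the caller can choose a
    provider itself. Only the first colon separates the provider, so model
    names that contain colons are kept intact.
    """
    spec = spec.strip()
    if spec == "auto-detect":
        return None, None
    provider, sep, model = spec.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {spec!r}")
    return provider, model
```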


    APNext Universal Vision Cloner

    Display Name: APNext Universal Vision Cloner

    Analyze images with any supported vision model to generate detailed descriptions or clone image styles.

    | Input | Description |
    |-------|-------------|
    | images | One or more images to analyze |
    | model | Vision model to use (auto-detect available) |
    | fade_percentage | Blend percentage for multiple images |
    | analysis_mode | Detailed Analysis, Style Cloning, Scene Description, Creative Interpretation |
    | output_format | Text Only, JSON Structure, or Formatted Prompt |

    Returns: (formatted_output, raw_response, faded_image, model_used)


    🤖 Google Gemini Nodes

    Gemini Prompt Enhancer

    Display Name: APNext Gemini Prompt Enhancer

    Enhances prompts with cinematic terminology and LLM refinement for video/image generation.

    | Input | Description |
    |-------|-------------|
    | base_prompt | Original prompt to enhance |
    | enhancement_mode | Random Mix, Cinematic/Lighting/Camera/Motion/Style Focus, Full Enhancement, or LLM Only |
    | use_llm | Enable Gemini LLM enhancement |
    | intensity | Enhancement intensity (0.1-2.0) |
    | Optional dropdowns | visual_style, lighting_type, camera_angle, shot_size, lens_type, color_tone, etc. |

    Returns: (enhanced_prompt, random_enhanced, llm_enhanced)


    Gemini Custom Vision

    Display Name: APNext Gemini Custom Vision

    Analyze multiple images with custom prompts. Supports dynamic prompt templates with variable substitution.

    | Input | Description |
    |-------|-------------|
    | images | Input images |
    | custom_prompt | Custom analysis prompt |
    | dynamic_prompt | Enable ##TAG##, ##SEX##, ##PRONOUNS##, ##WORDS## substitution |
    | fade_percentage | Blend multiple images together |

    Returns: (output, clip_l, faded_image)


    Gemini Text Only

    Display Name: APNext Gemini Text Only

    Pure text generation with Gemini models. Supports dynamic prompt templates.

    Returns: (output, clip_l)


    Gemini Next Scene

    Display Name: APNext Gemini Next Scene

    Generate cinematic transitions for visual narratives. Creates the “next scene” based on a previous prompt and current frame.

    | Input | Description |
    |-------|-------------|
    | image | Current frame image |
    | original_prompt | Previous scene description |
    | focus_on | Camera Movement, Framing Evolution, Environmental Reveals, Atmospheric Shifts |
    | transition_intensity | Subtle, Moderate, or Dramatic |

    Returns: (next_scene_prompt, short_description)


    💬 OpenAI GPT Nodes

    GPT Mini Generator

    Display Name: APNext GPT Mini Generator

    Efficient text generation using GPT-4o-mini.

    | Input | Description |
    |-------|-------------|
    | input_text | Text to enhance |
    | happy_talk | Enthusiastic vs professional tone |
    | compress | Enable output compression |
    | poster | Movie poster style formatting |


    GPT Vision Cloner

    Display Name: APNext GPT Vision Cloner

    Clone image styles using GPT-4o vision capabilities with custom prompts.


    GPT Custom Vision

    Display Name: APNext GPT Custom Vision

    Full custom vision analysis with GPT-4o.


    🧠 Anthropic Claude Nodes

    Claude Text Generator

    Display Name: APNext Claude Text Generator

    Text generation with Claude models (Claude 3.5 Sonnet, Claude Sonnet 4.5).

    | Input | Description |
    |-------|-------------|
    | input_text | Text to process |
    | claude_model | Model selection |
    | happy_talk, compress, poster | Output style controls |
    | variation_instruction | Custom instruction for creative variations |


    Claude Vision Analyzer

    Display Name: APNext Claude Vision Analyzer

    Image analysis with Claude’s multimodal capabilities.


    ⚡ xAI Grok Nodes

    Grok Text Generator

    Display Name: APNext Grok Text Generator

    Text generation using xAI’s Grok models.


    Grok Vision Analyzer

    Display Name: APNext Grok Vision Analyzer

    Image analysis with Grok vision models.


    🚀 Groq Nodes (Ultra-Fast Inference)

    Groq Text Generator

    Display Name: APNext Groq Text Generator

    Lightning-fast text generation using Groq’s optimized infrastructure with Llama and Mixtral models.

    | Input | Description |
    |-------|-------------|
    | groq_model | llama-3.3-70b-versatile, llama-3.1-8b-instant, etc. |
    | (other inputs) | Other standard LLM inputs |


    Groq Vision Analyzer

    Display Name: APNext Groq Vision Analyzer

    Fast image analysis with Groq vision models.


    🔍 QwenVL Nodes (Local Vision)

    QwenVL Vision Analyzer

    Display Name: APNext QwenVL Vision Analyzer

    Local vision analysis using Qwen-VL models. Downloads models automatically.

    | Input | Description |
    |-------|-------------|
    | images | Input images |
    | qwen_model | Qwen3-VL-4B-Instruct, etc. |
    | max_tokens | Maximum response length |
    | keep_model_loaded | Cache model in memory |


    QwenVL Vision Cloner

    Display Name: APNext QwenVL Vision Cloner

    Clone image styles locally without API calls.


    QwenVL Video Analyzer

    Display Name: APNext QwenVL Video Analyzer

    Analyze video content frame-by-frame.


    QwenVL Next Scene

    Display Name: APNext QwenVL Next Scene

    Generate cinematic scene transitions locally using QwenVL models. Takes a previous scene description and 1-5 frame images, then creates natural camera movements, framing evolution, and atmospheric shifts. Multiple frames help the model understand motion/progression.

    | Input | Description |
    |-------|-------------|
    | images | 1-5 frame images (batch) |
    | original_prompt | Previous scene description |
    | qwen_model | QwenVL model to use |
    | prompt_file | Custom prompt template file |
    | custom_prompt | Override with inline prompt (optional) |
    | max_frames | Max frames to use from batch (1-5) |
    | focus_on | Camera Movement, Framing Evolution, Environmental Reveals, Atmospheric Shifts |
    | transition_intensity | Subtle, Moderate, or Dramatic |
    | keep_model_loaded | Cache model in memory |

    Returns: (next_scene_prompt, short_description)

    Custom Prompts: Create your own prompt templates in data/custom_prompts/. Use ##ORIGINAL_PROMPT## as placeholder for the previous scene description. Included templates:

  • next_scene.txt – Default detailed cinematography prompt
  • qwen_next_scene_simple.txt – Simplified version
  • qwen_next_scene_video.txt – Optimized for AI video generation

    QwenVL Frame Prep

    Display Name: APNext QwenVL Frame Prep

    Utility node to prepare multiple images for QwenVL Next Scene. Accepts up to 5 individual images or a batch, scales them to max dimensions, and outputs a batched tensor.

    | Input | Description |
    |-------|-------------|
    | max_width | Maximum width (default 1024) |
    | max_height | Maximum height (default 1024) |
    | image_1 to image_5 | Individual image inputs |
    | image_batch | Pre-batched images (optional) |

    Returns: (images, frame_count)
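
To make the "scales them to max dimensions" step concrete, here is a minimal sketch of the size calculation. It assumes that the aspect ratio is preserved and that smaller images are never upscaled; the README confirms neither, and `fit_within` is a hypothetical helper rather than the node's code.

```python
def fit_within(width, height, max_width=1024, max_height=1024):
    """Return (width, height) scaled down to fit inside the bounding box.

    Assumptions of this sketch: the aspect ratio is kept, and images that
    already fit are returned unchanged (no upscaling).
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

With the default limits, a 2048x1024 frame becomes 1024x512 and an 800x600 frame is left as it is.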


    QwenVL Z-Image Vision

    Display Name: APNext QwenVL Z-Image Vision

    Analyzes images and outputs in Z-Image TurnBuilder chat format with <|im_start|>/<|im_end|> tokens.


    🦙 Ollama Nodes (Local LLM)

    Ollama Node

    Display Name: APNext OllamaNode

    Local LLM inference using Ollama. Supports any model installed in your Ollama instance.

    | Input | Description |
    |-------|-------------|
    | input_text | Text to process |
    | model_name | Any Ollama model (llama3, mistral, etc.) |
    | happy_talk, compress | Output controls |


    Ollama Vision

    Display Name: APNext OllamaVision

    Local vision analysis with Ollama multimodal models (llava, bakllava, etc.).


    📸 MiniCPM Nodes (Local Vision)

    MiniCPM Image Node

    Display Name: APNext MiniCPM Image

    Image understanding with MiniCPM-V 4.5 (OpenBMB). Supports thinking mode for complex reasoning.

    | Input | Description |
    |-------|-------------|
    | images | Input images |
    | question | Question about the image |
    | enable_thinking | Deep reasoning mode |
    | precision | bfloat16 or float16 |
    | unload_after_inference | Free memory after use |


    MiniCPM Video Node

    Display Name: APNext MiniCPM Video

    Video understanding and analysis.


    🔬 Phi Nodes (Microsoft Vision)

    Phi Model Loader

    Display Name: APNext Phi Model Loader

    Load Microsoft Phi-3.5-vision-instruct model.

    | Input | Description |
    |-------|-------------|
    | model_version | Phi-3.5-vision-instruct |
    | image_crops | 4 or 16 crops for detail |
    | attention_mechanism | flash_attention_2, sdpa, or eager |


    Phi Model Inference / Custom Inference

    Display Name: APNext Phi Model Inference

    Run inference with loaded Phi model.


    🎨 Image FX Nodes

    Professional image effects using optimized tensor operations.

    APNext Bloom FX

    Creates a bloom/glow effect on bright areas.

    | Input | Description |
    |-------|-------------|
    | intensity | Bloom strength (0-5) |
    | threshold | Brightness threshold (0-1) |
    | blur_radius | Glow spread (1-50) |
    | blend_mode | additive, screen, or overlay |
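
The three blend modes have conventional per-channel definitions on values in the 0-1 range. The sketch below shows those textbook formulas for a single channel value. `bloom_pixel` is a simplified, hypothetical stand-in that skips the blur step entirely, so it illustrates only how threshold, intensity and blend mode interact, not how the node is implemented.

```python
def additive(base, glow):
    """Add the glow and clamp to the valid range."""
    return min(1.0, base + glow)


def screen(base, glow):
    """Invert, multiply, invert again; the result is never darker than base."""
    return 1.0 - (1.0 - base) * (1.0 - glow)


def overlay(base, glow):
    """Multiply in dark regions of the base, screen in bright regions."""
    if base < 0.5:
        return 2.0 * base * glow
    return 1.0 - 2.0 * (1.0 - base) * (1.0 - glow)


BLEND_MODES = {"additive": additive, "screen": screen, "overlay": overlay}


def bloom_pixel(value, threshold=0.8, intensity=1.0, blend_mode="screen"):
    """Blend a pixel with its own bright-pass (blur omitted in this sketch)."""
    glow = min(1.0, max(0.0, value - threshold) * intensity)
    return BLEND_MODES[blend_mode](value, glow)
```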


    APNext Color Grading FX

    Professional color grading with LUT support or manual controls.

    | Input | Description |
    |-------|-------------|
    | method | manual or lut_file |
    | lut_file | .cube, .3dl, or image LUT |
    | exposure | -3 to +3 stops |
    | contrast, saturation | Standard adjustments |
    | highlights, shadows | Tone controls |
    | temperature, tint | White balance |

    Supported LUT Formats: .cube (Adobe/Blackmagic), .3dl (Autodesk/Flame), Image LUTs (.png, .jpg)
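
Exposure is given in stops, where each stop conventionally doubles or halves the light. A minimal sketch of that convention for one linear channel value follows; whether the node applies it in linear or gamma-encoded space is not stated here, so treat this as an assumption.

```python
def apply_exposure(value, stops):
    """Scale a 0-1 channel value by 2**stops and clamp the result."""
    return min(1.0, max(0.0, value * 2.0 ** stops))
```

For example, +1 stop turns 0.25 into 0.5, and -1 stop turns 0.5 into 0.25.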


    APNext Sharpen FX

    Intelligent image sharpening.


    APNext Noise FX

    Add film grain and noise effects.


    APNext Rough FX

    Add texture and roughness.


    APNext Cross Processing FX

    Film cross-processing color effects.


    APNext Split Toning FX

    Separate color toning for highlights and shadows.


    APNext HDR Tone Mapping FX

    HDR-style tone mapping.


    APNext Glitch Art FX

    Digital glitch and databending effects.


    APNext Film Halation FX

    Classic film halation (light bleeding) effect.


    📐 Latent Generators

    APNext Latent Generator

    Display Name: APNext Latent Generator

    Generate latent tensors with intelligent dimension calculation.

    | Input | Description |
    |-------|-------------|
    | width, height | Base dimensions (0 = auto-calculate) |
    | megapixel_scale | Target megapixels (0.1-2.0) |
    | aspect_ratio | 1:1, 3:2, 4:3, 16:9, 21:9 |
    | is_portrait | Portrait orientation |

    Returns: (LATENT, width, height)
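
One plausible way to auto-calculate dimensions from a megapixel target and an aspect ratio is sketched below. The details are assumptions: one megapixel is taken as 1,000,000 pixels, and both sides are rounded to a multiple of 8 because Stable Diffusion style latents are usually 1/8 of the pixel size. `latent_dimensions` is a hypothetical helper, not the node's code.

```python
import math


def latent_dimensions(megapixel_scale=1.0, aspect_ratio="16:9",
                      is_portrait=False, multiple=8):
    """Return (width, height) in pixels for a target area and aspect ratio."""
    ratio_w, ratio_h = (int(part) for part in aspect_ratio.split(":"))
    target_pixels = megapixel_scale * 1_000_000  # assumed definition of 1 MP

    # Solve width * height = target_pixels with width / height = ratio_w / ratio_h.
    height = math.sqrt(target_pixels * ratio_h / ratio_w)
    width = height * ratio_w / ratio_h

    # Snap both sides to the nearest multiple (assumed latent granularity).
    width = max(multiple, round(width / multiple) * multiple)
    height = max(multiple, round(height / multiple) * multiple)

    if is_portrait:
        width, height = height, width
    return width, height
```

Under these assumptions a 1.0 megapixel 16:9 request yields 1336x752, and the same request in portrait orientation yields 752x1336.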


    PGSD3 Latent Generator

    Display Name: APNext PGSD3LatentGenerator

    Optimized latent generation for Stable Diffusion 3 pipelines.


    🎲 Prompt Generators

    Auto Prompter

    Display Name: Auto Prompter

    Generate random prompts from extensive category databases.

    | Input | Description |
    |-------|-------------|
    | subject | Main subject (can include LoRA triggers) |
    | custom | Prefix text for styling |
    | artform | Photography, digital art, etc. |
    | Various category selections | Random or specific choices |


    APNext Node

    Display Name: APNext Node

    Advanced prompt building with category-based enhancements.

    Overview

    The system includes numerous nodes that can be chained together to create complex workflows.

    APNext Node supports 24 main categories, each with subcategories, including:

  • Architecture: styles, buildings, interiors, materials
  • Art: painting, sculpture, techniques, palettes
  • Artist: concept artists, illustrators, painters
  • Character: anime, fantasy, sci-fi, superheroes
  • Cinematic: directors, genres, effects, color grading
  • Fashion: designers, outfits, accessories
  • Feelings: emotional modifiers
  • Geography: countries, nationalities
  • Human: jobs, hobbies, groups
  • Interaction: individual, couple, group, crowd interactions
  • Keywords: modifiers, genres, trending terms
  • People: archetypes, body types, expressions
  • Photography: cameras, lenses, lighting, film types
  • Plots: action, romance, horror, sci-fi scenarios
  • Poses: portrait and action poses
  • Scene: weather, textures, environments
  • Science: astronomy, mathematics, medical
  • Stuff: seasonal objects, gadgets, fantasy items
  • Time: eras, decades, centuries
  • Typography: fonts, word art styles
  • Vehicle: cars, classic cars, vehicle types
  • Video Game: games, engines, actions

    🔧 Utility Nodes

    String Merger

    Display Name: APNext String Merger

    Combine multiple strings with separators.


    Flexible String Merger

    Display Name: APNext Flexible String Merger

    Advanced string combining with custom formatting.


    Sentence Mixer

    Display Name: APNext Sentence Mixer

    Shuffle and mix sentences from multiple inputs for creative variations.


    Custom Prompt Loader

    Display Name: APNext Custom Prompts

    Load prompt templates from the data/custom_prompts/ directory.

    Included templates:

  • promptcreator.txt – Full creative prompt generation
  • image_analyze.txt – Image analysis prompts
  • gemini_video.txt – Video generation prompts
  • cloner.txt – Style cloning prompts
  • Various LoRA-specific templates (ohwx, t5xxl, etc.)

    Local Random Prompt

    Display Name: APNext Local random prompt

    Load random prompts from local text files.


    Random Integer Generator

    Display Name: APNext Random Integer Generator

    Generate random integers with min/max range.


    📁 Adding Custom Categories

    Create your own categories for APNextNode:

  • Create a folder in data/next/ (e.g., data/next/mycategory/)
  • Add JSON files for each field

    Simple Format

    ["item1", "item2", "item3"]

    Advanced Format

    {
      "preprompt": "with",
      "separator": " and ",
      "endprompt": "visual effects",
      "items": ["motion blur", "lens flare", "particle effects"],
      "attributes": {
        "motion blur": ["dynamic", "cinematic"],
        "lens flare": ["bright", "atmospheric"]
      }
    }
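
To show how the fields of the advanced format might fit together, here is a minimal sketch that turns a category definition into a prompt fragment. The composition rule (preprompt, then items joined by the separator, then endprompt) is an assumption inferred from the field names. `build_fragment` is hypothetical, and the `attributes` field is ignored.

```python
import random


def build_fragment(category, count=2, seed=0):
    """Compose '<preprompt> <item><separator><item> <endprompt>'.

    `category` is either a plain list (simple format) or a dict (advanced
    format). Items are picked with a seeded RNG so results are repeatable.
    """
    if isinstance(category, list):  # simple format: just a list of items
        category = {"items": category}
    rng = random.Random(seed)
    items = category["items"]
    picked = rng.sample(items, k=min(count, len(items)))
    parts = [
        category.get("preprompt", ""),
        category.get("separator", ", ").join(picked),
        category.get("endprompt", ""),
    ]
    return " ".join(part for part in parts if part)
```

With the advanced example above, the result takes the form "with <item> and <item> visual effects", where the chosen items depend on the seed.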


    📝 Custom Prompt Templates

    Create your own prompt templates for use with the Custom Prompt Loader node.

    Location

    Place .txt files in: data/custom_prompts/

    Creating a Template

    Templates are plain text files containing instructions for LLM nodes. They support dynamic variable substitution:

    | Variable | Description |
    |----------|-------------|
    | ##TAG## | Replaced with the tag input (e.g., “ohwx man”) |
    | ##SEX## | Replaced with the sex input (e.g., “male”, “female”) |
    | ##PRONOUNS## | Replaced with pronouns (e.g., “him, his”) |
    | ##WORDS## | Replaced with target word count |

    Example Template

    Create a file data/custom_prompts/my_style.txt:

    As a professional art critic, describe the provided image in detail.
    Focus on creating a cohesive scene as if describing a movie still.
    
    If the subject is ##TAG##, use ##PRONOUNS## pronouns appropriately.
    The subject is ##SEX##.
    
    Include:
    - Main subject description with clothing, accessories, position
    - Setting and environment details
    - Lighting type, direction, and atmosphere
    - Color palette and emotional tone
    - Camera angle and composition
    
    Output approximately ##WORDS## words.
    Do not use JSON format. Provide a single cohesive paragraph.
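
The substitution itself can be pictured as plain string replacement. The sketch below is an illustration under that assumption; `render_template` and its keyword names are hypothetical, and only the four placeholders come from the table above.

```python
def render_template(template, tag="", sex="", pronouns="", words=""):
    """Replace the ##...## placeholders in a template with concrete values."""
    replacements = {
        "##TAG##": tag,
        "##SEX##": sex,
        "##PRONOUNS##": pronouns,
        "##WORDS##": str(words),
    }
    for placeholder, value in replacements.items():
        template = template.replace(placeholder, value)
    return template


line = "If the subject is ##TAG##, use ##PRONOUNS## pronouns. Output approximately ##WORDS## words."
print(render_template(line, tag="ohwx man", pronouns="him, his", words=150))
```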

    Included Templates

    | Template | Purpose |
    |----------|---------|
    | promptcreator.txt | Detailed image analysis (~150 words) |
    | promptcreator_small.txt | Concise image analysis |
    | image_analyze.txt | General image description |
    | cloner.txt | Style cloning prompts |
    | gemini_video.txt | Video generation prompts |
    | gemini_ohwx.txt | LoRA trigger-aware prompts |
    | t5xxl.txt | T5-XXL optimized prompts |
    | ltxv.txt | LTX Video model prompts |
    | next_scene.txt | Cinematic scene transitions |


    ⚙️ Configuring LLM Models

    Customize available models by editing JSON configuration files in the data/ folder.

    Model Configuration Files

    | File | Provider | Description |
    |------|----------|-------------|
    | gemini_models.json | Google Gemini | Gemini model list |
    | gpt_models.json | OpenAI | GPT model list |
    | claude_models.json | Anthropic | Claude model list |
    | grok_models.json | xAI | Grok model list |
    | groq_models.json | Groq | Groq model list (text + vision) |
    | qwenvl_models.json | QwenVL | Local Qwen vision models |

    QwenVL Models – Adding Private/Custom Models

    QwenVL nodes support loading additional models from private configuration files. This allows you to add custom or uncensored models without modifying the main configuration.

    How to add private models:

  • Create a JSON file in data/ with a name matching private_*qwenvl*.json
  • Examples: private_qwenvl_models.json, private_uncensored.qwenvl_models.json
  • Use the same format as qwenvl_models.json:

    {
        "models": [
            "huihui-ai/Huihui-Qwen3-VL-4B-Instruct-abliterated",
            "huihui-ai/Huihui-Qwen3-VL-8B-Instruct-abliterated",
            "another-namespace/custom-model"
        ]
    }

  • Restart ComfyUI – the models will appear in the QwenVL node dropdowns

    Notes:

  • Private files are loaded in addition to the main qwenvl_models.json
  • Duplicate models are automatically filtered out
  • Supports full HuggingFace repo paths (namespace/model-name)
  • Models are downloaded to ComfyUI/models/LLM/Qwen-VL/ on first use
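
The merge behaviour described in these notes can be sketched as follows. This is an illustration rather than the plugin's loader: `load_qwenvl_models` is a hypothetical function, and loading private files in alphabetical order is an assumption.

```python
import json
from pathlib import Path


def load_qwenvl_models(data_dir):
    """Merge qwenvl_models.json with every private_*qwenvl*.json file.

    The main file is read first, then private files in sorted order.
    Duplicate model names are dropped while the first-seen order is kept.
    """
    data_dir = Path(data_dir)
    files = [data_dir / "qwenvl_models.json"]
    files += sorted(data_dir.glob("private_*qwenvl*.json"))

    models = []
    for path in files:
        if not path.is_file():
            continue  # a missing main file is tolerated in this sketch
        config = json.loads(path.read_text(encoding="utf-8"))
        for name in config.get("models", []):
            if name not in models:
                models.append(name)
    return models
```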

    Basic Format

    Most model files use a simple array format:

    {
        "models": [
            "model-name-1",
            "model-name-2",
            "model-name-3"
        ]
    }

    Example: Adding New Gemini Models

    Edit data/gemini_models.json:

    {
        "models": [
            "gemini-2.5-pro",
            "gemini-2.5-flash",
            "gemini-flash-latest",
            "gemini-flash-lite-latest",
            "gemini-2.5-flash-lite",
            "gemini-exp-1206"
        ]
    }

    Example: Adding New Claude Models

    Edit data/claude_models.json:

    {
        "models": [
            "claude-sonnet-4.5",
            "claude-sonnet-4",
            "claude-sonnet-3.7",
            "claude-opus-4.1",
            "claude-opus-4",
            "claude-haiku-3.5",
            "claude-haiku-3"
        ]
    }

    Groq Models (Advanced Format)

    Groq supports separate text and vision model lists:

    {
        "text_models": [
            "llama-3.3-70b-versatile",
            "llama-3.1-8b-instant",
            "groq/compound",
            "qwen/qwen3-32b"
        ],
        "vision_models": [
            "meta-llama/llama-4-scout-17b-16e-instruct",
            "meta-llama/llama-4-maverick-17b-128e-instruct"
        ],
        "note": "Edit this file to add/remove models"
    }

    Notes

  • Restart ComfyUI after editing model configuration files
  • For Groq, the system will first try to fetch models from the API, then fall back to the JSON file
  • Model names must match exactly what the provider’s API expects
  • Invalid model names will cause API errors at runtime
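
The "API first, JSON file as fallback" behaviour noted for Groq can be pictured with the sketch below. `fetch_from_api` is a hypothetical callable supplied by the caller (no real Groq client call is shown), and treating an empty API result as a failure is an assumption.

```python
import json
from pathlib import Path


def resolve_groq_models(fetch_from_api, json_path):
    """Return (models, source), where source is 'api' or 'file'.

    `fetch_from_api` is any zero-argument callable returning model names.
    If it raises or returns nothing, the lists in the JSON file are used.
    """
    try:
        models = list(fetch_from_api())
        if models:
            return models, "api"
    except Exception:
        pass  # network or auth problems fall through to the file

    config = json.loads(Path(json_path).read_text(encoding="utf-8"))
    models = config.get("text_models", []) + config.get("vision_models", [])
    return models, "file"
```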


    🖼️ Example Workflows

    Example workflows are available in the examples/ directory:

  • APNext workflows: examples/flux/apnext/
  • Florence2 local: examples/flux/florence2/
  • GPT-4o Vision: examples/flux/gpt-4o_vision/
  • Ollama local: examples/flux/ollama_local_llm/
  • MiniCPM: examples/minicpm/


    📋 Requirements

    Pillow>=10.4.0
    requests>=2.32.5
    openai>=1.44.0
    blend-modes>=2.1.0
    huggingface_hub>=0.34.0
    color_matcher>=0.5.0
    chardet>=5.2.0
    google-generativeai>=0.7.2
    anthropic
    transformers>=4.40.0
    decord>=0.6.0
    scipy>=1.10.0
    tqdm>=4.67.1


    🔄 Model Support Matrix

    | Provider | Text | Vision | Video | Local |
    |----------|------|--------|-------|-------|
    | OpenAI GPT | ✅ | ✅ | ❌ | ❌ |
    | Google Gemini | ✅ | ✅ | ✅ | ❌ |
    | Anthropic Claude | ✅ | ✅ | ❌ | ❌ |
    | xAI Grok | ✅ | ✅ | ❌ | ❌ |
    | Groq | ✅ | ✅ | ❌ | ❌ |
    | QwenVL | ✅ | ✅ | ✅ | ✅ |
    | Ollama | ✅ | ✅ | ❌ | ✅ |
    | MiniCPM | ✅ | ✅ | ✅ | ✅ |
    | Phi-3.5 | ✅ | ✅ | ❌ | ✅ |


    📝 License

    MIT License


    🙏 Acknowledgments

    Built for the ComfyUI community. Special thanks to all contributors and users providing feedback.