ComfyUI_Hedra

ComfyUI_Hedra
★ 4

角色视频生成音频驱动动画背景协同动画ComfyUI节点
在ComfyUI中通过Hedra Character-3 API把图片与音频生成高拟真会说话的头像视频,支持背景协同动画、多分辨率与帧提取
💡 将人物图片与音频一键生成可编辑的会说话视频素材
🍴 3 Forks💻 Python🔄 2025-05-04
📦
网盘下载
复制链接后前往夸克网盘下载
https://pan.quark.cn/s/8f9eee5e2cdb
📦 requirements.txt
scipy>=1.10.0
opencv-python>=4.8.0
image
image
📄 README

ComfyUI Hedra Node

A custom node for ComfyUI that integrates with Hedra‘s Character-3 API to generate talking avatar videos from images and audio.

https://github.com/user-attachments/assets/13150d25-df1f-4bb6-8f0c-ff5ca599575e

Features

  • Generate talking avatar videos using Hedra’s advanced Character-3 technology
  • Intelligent background animation – background objects move in coordination with the main character
  • Support for multiple aspect ratios (16:9, 9:16, 1:1)
  • Multiple resolution options (540p, 720p)
  • Auto duration or custom duration settings
  • Custom emotion and gesture prompts
  • Video frame extraction for ComfyUI pipeline integration
  • Debug mode for troubleshooting
  • API connection testing utilities
  • Pricing

    Hedra Character-3 API Pricing:

  • ~3.5 to ~7 credits per second of video
  • Actual credit usage depends on video complexity and resolution
  • Example: A 30-second video costs approximately 105-210 credits
  • What’s New in Character-3

    Character-3 introduces several advanced features:

  • An othntic image to video character Animation driven by audio and music
  • Coordinated Background Animation: Background elements and objects move naturally in sync with the character’s movements
  • Enhanced Spatial Awareness: The AI understands the 3D space and creates more realistic depth and movement
  • Improved Gesture Recognition: Better interpretation of prompts for natural hand movements and body language
  • Advanced Emotion Mapping: More nuanced facial expressions that match the audio’s emotional content
  • Scene Coherence: Maintains consistency between character movements and environmental elements
  • Installation

  • Clone this repository into your ComfyUI custom nodes folder:
  • cd ComfyUI/custom_nodes
    git clone https://github.com/ShmuelRonen/ComfyUI_Hedra.git

  • Install the required dependencies:
  • cd ComfyUI_Hedra
    pip install -r requirements.txt

  • Restart ComfyUI
  • Getting Started

    1. Obtain API Key

  • Visit Hedra API Profile
  • Sign up or log in to your Hedra account
  • Subscribe to a paid plan (Creator tier or higher required for API access)
  • Navigate to your API profile page
  • Copy your API key (it should start with sk_h)
  • 2. Configure the Node

  • After installation, a config.json file will be created in the node folder
  • Open config.json and replace "your_api_key_here" with your actual API key:
  • {
        "api_key": "your_api_key_here"
    }

    Usage

    Available Nodes

    1. Hedra Image to Video

    The main node for generating talking avatar videos using Character-3.

    Inputs:

  • image: Input portrait image (Start frame) – can include background elements
  • audio: Audio file for lip-sync (Audio script)
  • prompt (optional): Text description for emotions, gestures, and scene dynamics
  • aspect_ratio: Choose from 16:9, 9:16, or 1:1
  • resolution: 540p or 720p
  • use_test_mode: Set to true for testing without API calls
  • debug_mode: Enable detailed logging
  • Outputs:

  • images: Video frames as a batch
  • audio: Original audio (pass-through)
  • frame_count: Number of frames extracted
  • video_url: URL of the generated video
  • fps: Frames per second (typically 24)
  • Prompt Examples for Better Results

    To take advantage of Character-3’s advanced capabilities, try these prompt patterns:

    Basic Emotion & Gesture:

    "smiling warmly, gesturing with hands while speaking"

    With Background Interaction:

    "speaking enthusiastically with hands, background gently swaying with movement"

    Complex Scene Dynamics:

    "passionate speech with emphatic gestures, environment responds to emotional intensity"

    Specific Background Elements:

    "confident presentation, curtains flutter as character moves, plants sway subtly"

    Cost Examples

    | Audio Duration | Approximate Credits | Estimated Cost Range |

    |—————-|——————–|——————–|

    | 10 seconds | 35-70 credits | ~3.5-7 credits/sec |

    | 30 seconds | 105-210 credits | ~3.5-7 credits/sec |

    | 60 seconds | 210-420 credits | ~3.5-7 credits/sec |

    *Note: Actual costs may vary based on video complexity and additional features used.*

    Workflow Example

  • Load an image using Load Image node (include background elements for best effect)
  • Load audio using an audio loader node
  • Connect both to the Hedra Image to Video node
  • Set your desired aspect ratio and resolution
  • Add a detailed prompt describing both character and scene dynamics
  • Connect the output frames to a Video Combine node or save them
  • Character-3 Advanced Features

    The Character-3 model from Hedra offers:

  • Scene Understanding: AI comprehends the relationship between character and environment
  • Dynamic Backgrounds: Background elements move naturally with character actions
  • Depth Perception: Creates realistic 3D movement within 2D images
  • Motion Coherence: Ensures all elements move in physically plausible ways
  • Adaptive Animation: Adjusts movement intensity based on audio energy and emotion
  • Best Practices for Background Animation

  • Image Selection: Choose images with distinct background elements (curtains, plants, furniture)
  • Prompt Clarity: Describe how you want the background to react to the character
  • Audio Matching: Background movement intensity matches audio energy levels
  • Scene Composition: Leave space around the character for natural movement
  • Important Notes

  • API Credits: Each video generation consumes credits from your Hedra account
  • Processing Time: Video generation typically takes 2-5 minutes
  • Audio Length: Longer audio files will consume more credits (3.5-7 credits per second)
  • Image Requirements: Best results with clear face portraits and visible background elements
  • Output Format: Videos are generated at 24 FPS
  • Troubleshooting

  • Background Not Animating:
  • Ensure your prompt mentions background movement
  • Use images with distinct background elements
  • Try more specific scene descriptions
  • API Key Issues:
  • Ensure your API key is correctly set in config.json
  • Verify you have an active paid subscription
  • Check that your API key starts with sk_h
  • Generation Failures:
  • Enable debug_mode for detailed error messages
  • Verify your audio format is supported (WAV recommended)
  • Ensure your image contains a clear face
  • API Endpoints

    The node uses the following Hedra API endpoints:

  • Base URL: https://api.hedra.com/web-app/public
  • /models – Get available AI models
  • /assets – Create and upload assets
  • /generations – Create and monitor video generations
  • Credits

    This node was developed to integrate Hedra’s powerful Character-3 technology with ComfyUI, enabling seamless talking avatar generation with advanced background animation in visual workflows.

    License

    This project is licensed under the MIT License – see the LICENSE file for details.

    Disclaimer

    This is an unofficial integration and is not affiliated with or endorsed by Hedra. Use of the Hedra API is subject to their terms of service and pricing. API costs are charged by Hedra and not by this node’s developer.