ComfyUI-HunyuanVideo-Avatar

★ 28

Multimodal · Video generation · Emotion control · Multi-character dialogue · ComfyUI node

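Integrates the HunyuanVideo-Avatar model into ComfyUI, using MM-DiT to simultaneously generate dynamic videos with controllable emotion and multi-character dialogue.

💡 Generate emotion-controllable, multi-character dialogue short videos in ComfyUI.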

🍴 6 Forks · 💻 Python · 🔄 2025-05-29


📦 Netdisk download

Copy the link below and open it in Quark Netdisk to download:

https://pan.quark.cn/s/a9fb3a59e10c

📦 requirements.txt

opencv-python==4.9.0.80
diffusers==0.33.0
transformers==4.45.1
accelerate==1.1.1
pandas==2.0.3
numpy==1.24.4
einops==0.7.0
tqdm==4.66.2
loguru==0.7.2
imageio==2.34.0
imageio-ffmpeg==0.5.1
safetensors==0.4.3
gradio==4.42.0
fastapi==0.115.12
uvicorn==0.34.2
decord==0.6.0
librosa==0.11.0
scikit-video==1.1.11
ffmpeg
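
The pins above can be compared against the active environment before launching ComfyUI. The following is a minimal sketch, not part of the repository: the function names `parse_pins` and `check_installed` are hypothetical, and it assumes every line is either a bare package name or a `name==version` pin, as in the list above.

```python
from importlib import metadata


def parse_pins(lines):
    """Return {package: pinned_version or None} from requirements-style lines."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        name, sep, version = line.partition("==")
        pins[name.strip()] = version.strip() if sep else None
    return pins


def check_installed(pins):
    """Yield (package, pinned, installed) for every pin that is not satisfied."""
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Unpinned packages only need to be present; pinned ones must match exactly.
        if installed is None or (pinned is not None and installed != pinned):
            yield name, pinned, installed
```

To use it, open `requirements.txt`, pass the file object to `parse_pins`, and print whatever `check_installed` yields. The comparison is a plain string match, so it only suits exact `==` pins like the ones listed here.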

📄 README

ComfyUI-HunyuanVideo-Avatar

HunyuanVideo-Avatar is now available in ComfyUI. HunyuanVideo-Avatar is a multimodal diffusion transformer (MM-DiT)-based model capable of simultaneously generating dynamic, emotion-controllable, and multi-character dialogue videos.

Installation

Make sure you have ComfyUI installed.

Clone this repository into your ComfyUI’s custom_nodes directory:

cd ComfyUI/custom_nodes
git clone https://github.com/Yuan-ManX/ComfyUI-HunyuanVideo-Avatar.git

Install dependencies:

cd ComfyUI-HunyuanVideo-Avatar
pip install -r requirements.txt

Installation Guide for Linux

We recommend CUDA versions 12.4 or 11.8 for the manual installation.

Conda’s installation instructions are available here.


# Install PyTorch and other dependencies using conda
# For CUDA 11.8
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=11.8 -c pytorch -c nvidia
# For CUDA 12.4
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.4 -c pytorch -c nvidia

# Install pip dependencies
python -m pip install -r requirements.txt

# Install flash attention v2 for acceleration (requires CUDA 11.8 or above)
python -m pip install ninja
python -m pip install git+https://github.com/Dao-AILab/flash-attention.git@v2.6.3
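
After the installation steps above, it can be useful to confirm that the PyTorch build matches the recommended versions. The sketch below is an illustration only: `check_environment` and `check_current_environment` are hypothetical helper names, and the expected values are simply the PyTorch 2.4.0 and CUDA 11.8 / 12.4 figures quoted in this guide.

```python
EXPECTED_TORCH = "2.4.0"
RECOMMENDED_CUDA = ("11.8", "12.4")


def check_environment(torch_version, cuda_version, cuda_available):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    # Wheels may carry a local suffix such as "+cu118"; compare only the base version.
    if torch_version.split("+")[0] != EXPECTED_TORCH:
        problems.append(f"PyTorch {torch_version} found, {EXPECTED_TORCH} expected")
    if cuda_version not in RECOMMENDED_CUDA:
        problems.append(
            f"PyTorch built for CUDA {cuda_version}, expected one of {RECOMMENDED_CUDA}"
        )
    if not cuda_available:
        problems.append("torch.cuda.is_available() returned False")
    return problems


def check_current_environment():
    """Run the check against the PyTorch installed in this interpreter."""
    import torch  # imported lazily so check_environment works without PyTorch

    return check_environment(
        torch.__version__, torch.version.cuda, torch.cuda.is_available()
    )
```

Calling `check_current_environment()` inside the conda environment returns an empty list when the build matches the recommendation. A non-empty list does not necessarily mean the nodes will fail, only that the setup differs from the one described here.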

If you run into a floating point exception (core dump) on specific GPU types, you may try the following solutions:

# Option 1: Make sure you have installed CUDA 12.4, cuBLAS>=12.4.5.8, and cuDNN>=9.00 (or simply use our CUDA 12 docker image).
pip install nvidia-cublas-cu12==12.4.5.8
export LD_LIBRARY_PATH=/opt/conda/lib/python3.8/site-packages/nvidia/cublas/lib/

# Option 2: Force the explicit use of the CUDA 11.8 compiled version of PyTorch and all the other packages
pip uninstall -r requirements.txt  # uninstall all packages
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
pip install ninja
pip install git+https://github.com/Dao-AILab/flash-attention.git@v2.6.3

Alternatively, you can use the HunyuanVideo Docker image. Use the following commands to pull and run it.

# For CUDA 12.4 (updated to avoid the floating point exception)
docker pull hunyuanvideo/hunyuanvideo:cuda_12
docker run -itd --gpus all --init --net=host --uts=host --ipc=host --name hunyuanvideo --security-opt=seccomp=unconfined --ulimit=stack=67108864 --ulimit=memlock=-1 --privileged hunyuanvideo/hunyuanvideo:cuda_12
pip install gradio==3.39.0 diffusers==0.33.0 transformers==4.41.2

# For CUDA 11.8
docker pull hunyuanvideo/hunyuanvideo:cuda_11
docker run -itd --gpus all --init --net=host --uts=host --ipc=host --name hunyuanvideo --security-opt=seccomp=unconfined --ulimit=stack=67108864 --ulimit=memlock=-1 --privileged hunyuanvideo/hunyuanvideo:cuda_11
pip install gradio==3.39.0 diffusers==0.33.0 transformers==4.41.2

Model

Download Pretrained Models

HunyuanVideo-Avatar Pretrained Models

All models are stored in ComfyUI/models/HunyuanVideo-Avatar/weights by default, and the file structure is as follows:

HunyuanVideo-Avatar
  ├──weights
  │  ├──ckpts
  │  │  ├──README.md
  │  │  ├──hunyuan-video-t2v-720p
  │  │  │  ├──transformers
  │  │  │  │  ├──mp_rank_00_model_states.pt
  │  │  │  │  ├──mp_rank_00_model_states_fp8.pt
  │  │  │  │  ├──mp_rank_00_model_states_fp8_map.pt
  │  │  │  ├──vae
  │  │  │  │  ├──pytorch_model.pt
  │  │  │  │  ├──config.json
  │  │  ├──llava_llama_image
  │  │  │  ├──model-00001-of-00004.safetensors
  │  │  │  ├──model-00002-of-00004.safetensors
  │  │  │  ├──model-00003-of-00004.safetensors
  │  │  │  ├──model-00004-of-00004.safetensors
  │  │  │  ├──...
  │  │  ├──text_encoder_2
  │  │  ├──whisper-tiny
  │  │  ├──det_align
  │  │  ├──...
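
A quick way to confirm that a download matches the layout above is to test for a few of the listed paths. This is a minimal sketch rather than anything shipped with the node: `EXPECTED` and `missing_entries` are hypothetical names, the list covers only a subset of the tree, and entries shown as `...` are deliberately not checked.

```python
from pathlib import Path

# Relative paths copied from the tree above (a subset, relative to the weights directory).
EXPECTED = [
    "ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt",
    "ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt",
    "ckpts/hunyuan-video-t2v-720p/vae/config.json",
    "ckpts/llava_llama_image",
    "ckpts/text_encoder_2",
    "ckpts/whisper-tiny",
    "ckpts/det_align",
]


def missing_entries(weights_dir, expected=EXPECTED):
    """Return the expected files or directories that do not exist under weights_dir."""
    root = Path(weights_dir)
    return [rel for rel in expected if not (root / rel).exists()]
```

For the default location, `missing_entries("ComfyUI/models/HunyuanVideo-Avatar/weights")` returns an empty list when every listed path is present. The check looks only at existence, not at file size or integrity.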

Download HunyuanVideo-Avatar model

To download the HunyuanVideo-Avatar model, first install the huggingface-cli. (Detailed instructions are available here.)

python -m pip install "huggingface_hub[cli]"

Then download the model using the following commands:

# Switch to the directory named 'HunyuanVideo-Avatar/weights'
cd HunyuanVideo-Avatar/weights
# Use the huggingface-cli tool to download HunyuanVideo-Avatar model in HunyuanVideo-Avatar/weights dir.
# The download time may vary from 10 minutes to 1 hour depending on network conditions.
huggingface-cli download tencent/HunyuanVideo-Avatar --local-dir ./
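
The same download can also be started from Python through the `huggingface_hub` package installed above. The sketch below assumes that `snapshot_download` with `local_dir` produces the same layout as the CLI command. The wrapper name `download_weights` and its default directory are illustrative choices, not part of the project.

```python
def download_weights(
    local_dir="HunyuanVideo-Avatar/weights",
    repo_id="tencent/HunyuanVideo-Avatar",
):
    """Download the model repository into local_dir and return the local path."""
    # Imported lazily so this module can be loaded without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_weights()` from the directory that contains `HunyuanVideo-Avatar` should be roughly equivalent to the `cd` plus `huggingface-cli download` sequence, and the same caveat about download time applies.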

Requirements

An NVIDIA GPU with CUDA support is required.

The model is tested on a machine with 8 GPUs.

Minimum: 24 GB of GPU memory is required for 704px × 768px, 129 frames, but generation is very slow.

Recommended: We recommend using a GPU with 96 GB of memory for better generation quality.

Tip: If an out-of-memory (OOM) error occurs on a GPU with 80 GB of memory, try reducing the image resolution.

Tested operating system: Linux
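
The memory figures above can be checked against the local hardware with `nvidia-smi`. The following is a rough sketch under stated assumptions: the helper names are hypothetical, the thresholds are the 24 GB and 96 GB values quoted in this section, and the reported MiB total is rounded to the nearest GB because cards usually report slightly less than their nominal size.

```python
import subprocess

MIN_GB = 24          # minimum quoted above (very slow)
RECOMMENDED_GB = 96  # recommended quoted above


def parse_memory_mib(smi_output):
    """Parse one integer (MiB) per line, as printed by the nvidia-smi query below."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def classify(total_mib):
    """Map a GPU's total memory in MiB to a label based on the thresholds above."""
    gb = round(total_mib / 1024)  # nominal size in GB
    if gb < MIN_GB:
        return "below minimum"
    if gb < RECOMMENDED_GB:
        return "meets minimum"
    return "meets recommendation"


def query_gpus():
    """Return the total memory in MiB of each visible GPU (requires nvidia-smi)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_memory_mib(out)
```

On a machine with the NVIDIA driver installed, `[classify(m) for m in query_gpus()]` gives one label per GPU. The label reflects only total memory, not what is free at the moment.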