ComfyUI_HunyuanAvatar_Sm

★ 81

Audio-driven · Portrait animation · Multi-character support · High-fidelity reconstruction · High VRAM requirement

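The ComfyUI_HunyuanAvatar_Sm node brings high-fidelity, audio-driven, multi-character portrait animation to ComfyUI. It supports face masks and experimental dual-character testing, fixes an error related to CPU offload, and is best suited to GPUs with a large amount of VRAM.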

💡 Generate high-fidelity facial and motion animation for multiple characters from audio in ComfyUI

🍴 5 Forks · 💻 Python · 🔄 2025-06-24

📦 requirements.txt

opencv-python
diffusers==0.33.0
transformers==4.45.1
accelerate
pandas
numpy
einops
tqdm
loguru
imageio
imageio-ffmpeg
safetensors
#gradio==4.42.0
#fastapi==0.115.12
#uvicorn==0.34.2
decord
librosa
scikit-video
ffmpeg
flash-attn
omegaconf

📄 README

ComfyUI_HunyuanAvatar_Sm

HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation for Multiple Characters. Try it in ComfyUI if your GPU has more than 12 GB of VRAM (10 GB may also be enough).
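Before installing, it can help to check how much VRAM the GPU actually reports. The following is a minimal sketch, not part of this node: the helper names `detect_vram_bytes` and `enough_vram` are hypothetical, and the 10 GiB default threshold is an assumption taken from the lower figure mentioned above. It relies on PyTorch being installed and returns `None` when it is not.

```python
def detect_vram_bytes(device=0):
    """Return total memory of a CUDA device in bytes, or None if unavailable."""
    try:
        import torch  # imported lazily so the sketch also loads without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(device).total_memory


def enough_vram(total_bytes, min_gib=10.0):
    """True if total_bytes is at least min_gib GiB (threshold is an assumption)."""
    return total_bytes / 2**30 >= min_gib


if __name__ == "__main__":
    total = detect_vram_bytes()
    if total is None:
        print("No CUDA device (or no PyTorch) detected.")
    else:
        verdict = "OK" if enough_vram(total) else "probably too little"
        print(f"{total / 2**30:.1f} GiB VRAM: {verdict}")
```

Real memory use also depends on resolution, clip length, and whether CPU offload or fp8 weights are used, so treat the result as a rough guide only.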

TIPS:

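Fixed a bug where disabling cpu_offload caused an error.

Experimental dual-character support (test code, not guaranteed to work). The face size parameter controls the face region taken from the input reference image for the face mask; use a smaller value if the face is small. The default is 3.0.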

1. Installation

In the ./ComfyUI/custom_nodes directory, run the following:

git clone https://github.com/smthemex/ComfyUI_HunyuanAvatar_Sm.git

2. Requirements

pip install -r requirements.txt
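Two packages in requirements.txt are pinned to exact versions (diffusers==0.33.0 and transformers==4.45.1), and a mismatch with what ComfyUI already has installed is a plausible source of import errors. The sketch below, which is not part of this repository, uses the standard-library `importlib.metadata` to report pins that are not satisfied; the helper name `check_pins` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from requirements.txt; unpinned packages are not checked here.
PINNED = {"diffusers": "0.33.0", "transformers": "4.45.1"}


def check_pins(pins, get_version=version):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed at all.
    `get_version` is injectable so the function can be tested without
    touching the real environment.
    """
    problems = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    issues = check_pins(PINNED)
    for name, (expected, found) in issues.items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
    if not issues:
        print("All pinned packages match requirements.txt.")
```

Run it with the same Python interpreter that ComfyUI uses, since a portable or virtual-environment install has its own site-packages.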

3. Models

Download the model files from tencent/HunyuanVideo-Avatar and arrange them as follows:

├── ComfyUI/models/HunyuanAvatar/
|   ├── det_align/
|         ├── detface.pt
|   ├── llava_llama_image/
|         ├── config.json
|         ├── ... all JSON files and all safetensors model files
|   ├── text_encoder_2/
|         ├── config.json
|         ├── ... all JSON files and model.safetensors
|   ├── vae/
|         ├── config.json
|         ├── pytorch_model.pt
|   ├── whisper-tiny/
|         ├── config.json
|         ├── ... all JSON files and model.safetensors
|   ├── mp_rank_00_model_states_fp8_map.pt  # 104K, download only if using fp8
|   ├── mp_rank_00_model_states_fp8.pt  # 24.9G, download only if using fp8
|   ├── mp_rank_00_model_states.pt
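A misplaced file is easy to miss in a layout like the one above, so a quick check before launching ComfyUI can save a failed run. The sketch below is an illustration rather than part of this node: `missing_files` is a hypothetical helper, the fp8 checkpoint name is assumed to end in a single `.pt`, and it assumes the fp8 files replace `mp_rank_00_model_states.pt` rather than accompany it. Entries shown as "..." in the tree are not checked.

```python
from pathlib import Path

# Relative paths taken from the tree above; "..." entries are not checked.
REQUIRED = [
    "det_align/detface.pt",
    "llava_llama_image/config.json",
    "text_encoder_2/config.json",
    "vae/config.json",
    "vae/pytorch_model.pt",
    "whisper-tiny/config.json",
]
FULL_WEIGHTS = ["mp_rank_00_model_states.pt"]
# Assumption: single ".pt" suffix; adjust if your downloaded file is named differently.
FP8_WEIGHTS = ["mp_rank_00_model_states_fp8.pt", "mp_rank_00_model_states_fp8_map.pt"]


def missing_files(root, use_fp8=False):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    wanted = REQUIRED + (FP8_WEIGHTS if use_fp8 else FULL_WEIGHTS)
    return [rel for rel in wanted if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files("ComfyUI/models/HunyuanAvatar", use_fp8=False)
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All checked files are present.")
```

Run it from the directory that contains ComfyUI, or pass an absolute path to `missing_files`.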

4. Example

🔗 BibTeX

If you find HunyuanVideo-Avatar useful for your research and applications, please cite using this BibTeX:

@misc{hu2025HunyuanVideo-Avatar,
      title={HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation for Multiple Characters}, 
      author={Yi Chen and Sen Liang and Zixiang Zhou and Ziyao Huang and Yifeng Ma and Junshu Tang and Qin Lin and Yuan Zhou and Qinglin Lu},
      year={2025},
      eprint={2505.20156},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/pdf/2505.20156}, 
}

Acknowledgements

We would like to thank the contributors to the HunyuanVideo, SD3, FLUX, Llama, LLaVA, Xtuner, diffusers and HuggingFace repositories, for their open research and exploration.