ComfyUI_HunyuanAvatar_Sm

★ 81

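Audio-driven portrait animation · Multi-character support · High-fidelity reconstruction · High VRAM requirement
The ComfyUI_HunyuanAvatar_Sm node implements audio-driven, high-fidelity, multi-character portrait animation in ComfyUI. It supports face masks and dual-character testing, fixes an error related to CPU offload, and is suited to high-VRAM environments.
💡 Generate high-fidelity facial and motion animation for multiple characters from audio in ComfyUI
🍴 5 Forks · 💻 Python · 🔄 2025-06-24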
📦
Netdisk download
Copy the link, then download from Quark Netdisk:
https://pan.quark.cn/s/c1eafc754fbb
📦 requirements.txt
opencv-python
diffusers==0.33.0
transformers==4.45.1
accelerate
pandas
numpy
einops
tqdm
loguru
imageio
imageio-ffmpeg
safetensors
#gradio==4.42.0
#fastapi==0.115.12
#uvicorn==0.34.2
decord
librosa
scikit-video
ffmpeg
flash-attn
omegaconf
📄 README

ComfyUI_HunyuanAvatar_Sm

  • HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation for Multiple Characters. Try it in ComfyUI if you have more than 12 GB (or 10 GB) of VRAM.
  • TIPS:


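  • Fixed the error that occurred when cpu_offload was disabled.
  • Experimental dual-character support (test code, may not always work). The face size parameter sets the face region (face mask) taken from the input reference image: use a smaller value if the face is small. The default is 3.0.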
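To make the face size tip concrete, here is a minimal sketch of one way a scale factor could turn a detected face box into a mask region. This is an illustration only, not the node's actual implementation: the function name `expand_face_box`, the box format, and the scale-around-the-center behaviour are all assumptions.

```python
def expand_face_box(box, face_size, img_w, img_h):
    """Scale a face box around its center and clamp it to the image.

    Hypothetical helper: `box` is (x0, y0, x1, y1) in pixels and
    `face_size` is a multiplicative factor (the README default is 3.0).
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * face_size / 2
    half_h = (y1 - y0) * face_size / 2
    # Clamp so the region never leaves the image bounds.
    left = max(0, int(round(cx - half_w)))
    top = max(0, int(round(cy - half_h)))
    right = min(img_w, int(round(cx + half_w)))
    bottom = min(img_h, int(round(cy + half_h)))
    return left, top, right, bottom
```

Under this assumption, a larger face size covers more of the image around the detected face, and a smaller value keeps the mask tight.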
    1. Installation


    In the ./ComfyUI/custom_nodes directory, run the following:

    git clone https://github.com/smthemex/ComfyUI_HunyuanAvatar_Sm.git

    2. Requirements


    pip install -r requirements.txt
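
    Because requirements.txt pins diffusers and transformers to exact versions, it can help to confirm what actually got installed. The following is a minimal sketch using only the standard library. The helper name `check_pins` is an assumption for illustration, and the pins are copied from the requirements list.

```python
from importlib.metadata import PackageNotFoundError, version

# Exact pins copied from requirements.txt.
PINS = {"diffusers": "0.33.0", "transformers": "4.45.1"}


def check_pins(pins, get_version=version):
    """Return {package: (wanted, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed at all.
    `get_version` is injectable so the check can be tested without installs.
    """
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        if found != wanted:
            problems[name] = (wanted, found)
    return problems
```

    Calling `check_pins(PINS)` in the Python environment that ComfyUI uses returns an empty dict when both pins match.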

    3. Models


  • Download the model files from tencent/HunyuanVideo-Avatar and arrange them as follows:
  • ├── ComfyUI/models/HunyuanAvatar/
    |   ├── det_align/
    |   |     ├── detface.pt
    |   ├── llava_llama_image/
    |   |     ├── config.json
    |   |     ├── ...  # all JSON files and all safetensors model files
    |   ├── text_encoder_2/
    |   |     ├── config.json
    |   |     ├── ...  # all JSON files and model.safetensors
    |   ├── vae/
    |   |     ├── config.json
    |   |     ├── pytorch_model.pt
    |   ├── whisper-tiny/
    |   |     ├── config.json
    |   |     ├── ...  # all JSON files and model.safetensors
    |   ├── mp_rank_00_model_states_fp8_map.pt  # 104 KB, download only if using fp8
    |   ├── mp_rank_00_model_states_fp8.pt  # 24.9 GB, download only if using fp8
    |   ├── mp_rank_00_model_states.pt
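
    A missing file in this layout usually only shows up when a workflow runs, so a quick check beforehand can save time. The sketch below is an illustration, not part of the node: `missing_model_files` is a hypothetical helper, and it checks only the files named explicitly in the tree above. The optional fp8 checkpoints and the files covered by "..." are not checked.

```python
from pathlib import Path

# Files named explicitly in the tree above, relative to
# ComfyUI/models/HunyuanAvatar/. Optional fp8 files are left out.
REQUIRED = [
    "det_align/detface.pt",
    "llava_llama_image/config.json",
    "text_encoder_2/config.json",
    "vae/config.json",
    "vae/pytorch_model.pt",
    "whisper-tiny/config.json",
    "mp_rank_00_model_states.pt",
]


def missing_model_files(root):
    """Return the entries of REQUIRED that do not exist as files under root."""
    root = Path(root)
    return [rel for rel in REQUIRED if not (root / rel).is_file()]
```

    For example, `missing_model_files("ComfyUI/models/HunyuanAvatar")` returns an empty list when every listed file is in place.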

    4. Example


    🔗 BibTeX

    If you find HunyuanVideo-Avatar useful for your research and applications, please cite using this BibTeX:

    @misc{hu2025HunyuanVideo-Avatar,
          title={HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation for Multiple Characters}, 
          author={Yi Chen and Sen Liang and Zixiang Zhou and Ziyao Huang and Yifeng Ma and Junshu Tang and Qin Lin and Yuan Zhou and Qinglin Lu},
          year={2025},
          eprint={2505.20156},
          archivePrefix={arXiv},
          primaryClass={cs.CV},
          url={https://arxiv.org/pdf/2505.20156}, 
    }

    Acknowledgements

    We would like to thank the contributors to the HunyuanVideo, SD3, FLUX, Llama, LLaVA, Xtuner, diffusers and HuggingFace repositories, for their open research and exploration.