mediapipe[vision]==0.10.14 mlflow moviepy==1.0.3 omegaconf onnxruntime transformers av regex scipy



If you find this project helpful, consider buying me a coffee:
[Buy Me a Coffee](https://buymeacoffee.com/shmuelronen)
A ComfyUI custom node wrapper for JoyHallo – One-Shot Audio-Driven Talking Head Generation.
cd ComfyUI/custom_nodes
git clone https://github.com/ShmuelRonen/ComfyUI-JoyHallo_wrapper
cd ComfyUI-JoyHallo_wrapper
pip install -r requirements.txt
Models are downloaded automatically to models/JOY/HALLO/.
If automatic download fails, you can install the models manually:
git lfs install
Download the models into ComfyUI/models/JOY/HALLO/:
# Base model
git clone https://huggingface.co/fudan-generative-ai/hallo pretrained_models
# Wav2Vec model
git clone https://huggingface.co/TencentGameMate/chinese-wav2vec2-base
# JoyHallo model
git clone https://huggingface.co/jdh-algo/JoyHallo-v1 pretrained_models/joyhallo
Final structure should be:
ComfyUI/models/JOY/
└── HALLO/
    ├── stable-diffusion-v1-5/
    ├── chinese-wav2vec2-base/
    └── JoyHallo-v1/
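
After a manual install, a quick check that the folders match the layout above can save a failed first run. The following is a minimal sketch, not part of the wrapper itself: the helper name `missing_model_dirs` is hypothetical, and the path used in the `__main__` section assumes the script is run from the directory that contains `ComfyUI/`.

```python
from pathlib import Path

# Sub-folder names copied from the "Final structure" listing above.
EXPECTED_DIRS = ("stable-diffusion-v1-5", "chinese-wav2vec2-base", "JoyHallo-v1")


def missing_model_dirs(hallo_root):
    """Return the expected sub-folders that are absent under hallo_root."""
    root = Path(hallo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumed location relative to the current directory; adjust to your install.
    missing = missing_model_dirs(Path("ComfyUI") / "models" / "JOY" / "HALLO")
    if missing:
        print(f"Missing model folders: {missing}")
    else:
        print("All model folders found.")
```

The check only looks at folder names. It does not verify that the weights inside each folder are complete.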
PyTorch version: 2.4.0+cu121
xformers version: 0.0.27.post2
Python version: 3.12.7
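
To compare a local environment against the versions listed above, something like the following sketch may help. It uses only the standard library. The helper names `installed_version` and `report` are illustrative, and the distribution names `torch` and `xformers` are assumed to be the names those packages are installed under.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; distribution names are assumptions.
TESTED = {"torch": "2.4.0+cu121", "xformers": "0.0.27.post2"}
TESTED_PYTHON = "3.12.7"


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def report(tested=TESTED):
    """Build one line per component: the installed version next to the tested one."""
    lines = [f"python: {sys.version.split()[0]} (tested: {TESTED_PYTHON})"]
    for name, wanted in tested.items():
        found = installed_version(name) or "not installed"
        lines.append(f"{name}: {found} (tested: {wanted})")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A mismatch does not necessarily mean the node will fail. The listed versions are simply the ones reported above.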
Inputs:
Outputs:
graph LR
A[Load Audio] --> C[JoyHallo_wrapper]
B[Load Image] --> C
C --> D[Video Output]
This is a wrapper for JoyHallo by jdh-algo, following their original license.
Key components:
@article{jin2024joyhallo,
title={JoyHallo: One-Shot Arbitrary-Face Audio-Driven Talking Head Generation},
author={Junhao Jin and Tong Yu and Boyuan Jiang and Zhendong Mao and Yemin Shi},
year={2024},
journal={arXiv preprint arXiv:2401.17221},
}