accelerate torch torchvision Pillow numpy omegaconf decord einops matplotlib diffusers scipy av imageio opencv_contrib_python transformers huggingface_hub onnxruntime

ComfyUI support for lihxxx/DisPose, which generates a new video using a reference video for the poses and a reference image for everything else.
_Assuming that you are under your ComfyUI root directory_
git clone https://github.com/bombax-xiaoice/ComfyUI-DisPose custom_nodes/ComfyUI-DisPose
pip install -r custom_nodes/ComfyUI-DisPose/requirements.txt
_You can download the model files from Hugging Face or its mirror site beforehand, or just wait for the first run of (Down)Loader Dispose to download them_
mkdir custom_nodes/ComfyUI-DisPose/pretrained_weights
cd custom_nodes/ComfyUI-DisPose/pretrained_weights
git lfs clone https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5
git lfs clone https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt-1-1
git lfs clone https://huggingface.co/yzd-v/DWPose
wget https://huggingface.co/lihxxx/DisPose/resolve/main/DisPose.pth
wget https://huggingface.co/tencent/MimicMotion/resolve/main/MimicMotion_1-1.pth
wget https://huggingface.co/MyNiuuu/MOFA-Video-Hybrid/resolve/main/models/cmp/experiments/semiauto_annot/resnet50_vip%2Bmpii_liteflow/checkpoints/ckpt_iter_42000.pth.tar -O ../mimicmotion/modules/cmp/experiments/semiauto_annot/resnet50_vip+mpii_liteflow/checkpoints/ckpt_iter_42000.pth.tar
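After downloading, it can be worth checking that each weight file is real binary data. Two common failure modes are an HTML page saved in place of the file, and a small Git LFS pointer file left behind when Git LFS is not installed. The following is a minimal sketch of such a check; the function name `check_weight` and the output labels are illustrative and not part of ComfyUI-DisPose.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ComfyUI-DisPose): classify a downloaded file
# as OK, MISSING, POINTER (un-fetched Git LFS pointer) or HTML (saved web page).
check_weight() {
    local path="$1"
    # -s is true only if the file exists and is not empty.
    if [ ! -s "$path" ]; then
        echo "MISSING $path"
        return 1
    fi
    # Git LFS pointer files are small text files that start with this line.
    if head -c 64 "$path" | LC_ALL=C grep -aq '^version https://git-lfs'; then
        echo "POINTER $path"
        return 1
    fi
    # A saved web page starts with an HTML doctype or <html> tag.
    if head -c 64 "$path" | LC_ALL=C grep -aqiE '<!doctype html|<html'; then
        echo "HTML $path"
        return 1
    fi
    echo "OK $path"
}

# Example: run from custom_nodes/ComfyUI-DisPose/pretrained_weights.
for f in DisPose.pth MimicMotion_1-1.pth; do
    check_weight "$f" || true
done
```

Any file reported as `HTML` or `POINTER` should be downloaded again before the first run.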
Drag the following image into ComfyUI, or click Load and select custom_nodes/ComfyUI-DisPose/dispose-comfy-example.json
Results from running under ComfyUI
https://github.com/user-attachments/assets/0e36a3c1-f0c4-4d2b-afe9-b3dc219d6e43
DisPose does some very good engineering to allow more precise control over AIGC video clips, at least better than any other open-source solution available for now. Unfortunately, the base models DisPose uses are SD1.5 and SVD-XT 1.1, which are years old and no longer state of the art even within the open-source world. As a result, defects in details, including faces, hair, clothes and physics, still show up here and there in its output videos. One way forward is to wait for the DisPose team to deliver new versions built on better base models. Until then, another way is to repaint the DisPose output (keeping all frames, or one out of every N frames, as the input reference) with a more recent state-of-the-art model such as Allegro or NVIDIA Cosmos. If face identity is critical, one may also need to repaint the face in every frame using LoRA+ControlNet+FaceDetailer. (Many mature ComfyUI pipelines for this are available online.)
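For the "one out of every N frames" option, the reference frames can be pulled out of the DisPose output with ffmpeg before they go into a repaint pipeline. This is only a sketch: it assumes ffmpeg 5.1 or newer is installed, and the function names, the output directory layout and the PNG naming pattern are illustrative choices, not something ComfyUI-DisPose provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an ffmpeg "select" expression that keeps
# frames 0, N, 2N, ... and drops the rest.
every_nth_filter() {
    local n="$1"
    # The comma must be backslash-escaped inside an ffmpeg filter expression.
    printf 'select=not(mod(n\\,%s))' "$n"
}

# Hypothetical helper: write every N-th frame of a video as numbered PNG files.
extract_reference_frames() {
    local video="$1" out_dir="$2" n="$3"
    mkdir -p "$out_dir"
    # -fps_mode vfr stops ffmpeg from duplicating frames to fill the gaps
    # (builds older than ffmpeg 5.1 use "-vsync vfr" instead).
    ffmpeg -i "$video" -vf "$(every_nth_filter "$n")" -fps_mode vfr "$out_dir/%05d.png"
}

# Example (assumed file names): keep one frame out of every 4.
# extract_reference_frames dispose_output.mp4 reference_frames 4
```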