accelerate==0.33.0
diffusers==0.28.0
numpy==1.24.4
torch==2.4.1
tqdm==4.66.2
transformers==4.40.1
xformers==0.0.28.post1
einops==0.7.0
decord==0.6.0
sentencepiece==0.1.99
imageio
imageio-ffmpeg
ftfy
bs4


This repository adds ComfyUI support for rhymes-ai/Allegro, which generates short videos from a text prompt at relatively high quality, especially compared with other open-source solutions currently available.
_Assuming that you are under your ComfyUI root directory_
git clone https://github.com/bombax-xiaoice/ComfyUI-Allegro custom_nodes/ComfyUI-Allegro
pip install -r custom_nodes/ComfyUI-Allegro/requirements.txt
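After installing, it can help to confirm that the pinned versions actually ended up in the environment. The sketch below is one way to do that with the standard library; `parse_pins` and `check_pins` are hypothetical helpers, not part of ComfyUI-Allegro, and they assume the requirements file only contains `name==version` pins or bare package names.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(text):
    """Return {name: pinned_version_or_None} from requirements-style text."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#")[0]  # drop trailing comments
        for token in line.split():
            name, sep, ver = token.partition("==")
            pins[name] = ver if sep else None
    return pins


def check_pins(pins):
    """Return (name, wanted, installed) for packages that are missing or differ."""
    problems = []
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append((name, wanted, None))
            continue
        # Ignore local build tags such as "+cu121" when comparing.
        if wanted is not None and installed.split("+")[0] != wanted:
            problems.append((name, wanted, installed))
    return problems
```

Reading `custom_nodes/ComfyUI-Allegro/requirements.txt`, passing its text to `parse_pins`, and then passing the result to `check_pins` should give an empty list when every pin is satisfied.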
_You can download the model files from Hugging Face or a mirror site beforehand, or just let the first run of (Down)Load Allegro Model or (Down)Load Allegro TextImage2Video Model download them_
git lfs clone https://huggingface.co/rhymes-ai/Allegro custom_nodes/ComfyUI-Allegro/models
git lfs clone https://huggingface.co/rhymes-ai/Allegro-TI2V custom_nodes/ComfyUI-Allegro/ti2v_models
_Alternatively, if local disk space or download time is a concern, download only the transformer from Allegro-TI2V and share the other folders with Allegro_
mkdir -p custom_nodes/ComfyUI-Allegro/ti2v_models/transformer/
wget https://huggingface.co/rhymes-ai/Allegro-TI2V/resolve/main/transformer/config.json -O custom_nodes/ComfyUI-Allegro/ti2v_models/transformer/config.json
wget https://huggingface.co/rhymes-ai/Allegro-TI2V/resolve/main/transformer/diffusion_pytorch_model.safetensors -O custom_nodes/ComfyUI-Allegro/ti2v_models/transformer/diffusion_pytorch_model.safetensors
ln -s ../models/vae custom_nodes/ComfyUI-Allegro/ti2v_models/vae
ln -s ../models/text_encoder custom_nodes/ComfyUI-Allegro/ti2v_models/text_encoder
ln -s ../models/tokenizer custom_nodes/ComfyUI-Allegro/ti2v_models/tokenizer
ln -s ../models/scheduler custom_nodes/ComfyUI-Allegro/ti2v_models/scheduler
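The folder sharing can also be scripted. The following is a minimal sketch, not part of ComfyUI-Allegro: `share_folders` is a hypothetical helper that links the four shared folders into `ti2v_models`. It uses relative link targets because a relative symlink is resolved from the directory that contains the link, not from the directory where the command was run.

```python
import os
from pathlib import Path

# Folders that Allegro-TI2V can reuse from the Allegro download.
SHARED = ("vae", "text_encoder", "tokenizer", "scheduler")


def share_folders(models_dir, ti2v_dir, names=SHARED):
    """Symlink each shared folder of models_dir into ti2v_dir; return new links."""
    models_dir, ti2v_dir = Path(models_dir), Path(ti2v_dir)
    ti2v_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name in names:
        link = ti2v_dir / name
        if link.exists() or link.is_symlink():
            continue  # leave existing folders or links untouched
        # Target expressed relative to the link's own directory, e.g. ../models/vae
        target = os.path.relpath(models_dir / name, start=ti2v_dir)
        os.symlink(target, link, target_is_directory=True)
        created.append(link)
    return created
```

From the ComfyUI root, calling `share_folders("custom_nodes/ComfyUI-Allegro/models", "custom_nodes/ComfyUI-Allegro/ti2v_models")` would create the same four links as the `ln -s` commands. On Windows, creating symlinks may require extra privileges.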
Drag the following image into ComfyUI, or click Load and select custom_nodes/ComfyUI-Allegro/allegro-comfy-example.json
Results from running under ComfyUI
https://github.com/user-attachments/assets/75f90597-7e33-4076-b00f-7ed5d88ea22b
Drag the following image into ComfyUI, or click Load and select custom_nodes/ComfyUI-Allegro/allegro-ti2v-comfy-example.json
ref_images, a required input to Allegro TextImage2Video Encoder, can be one reference image (starting frame), two reference images (starting and ending frames), or multiple reference images (frame interpolation). Then pass both ref_latents and ref_masks to Allegro TextImage2Video Sampler.
WAS-Suite's Image Batch can be used to put the reference images together.
The indices parameter further customizes the image-to-frame mapping, e.g. 0,10,-1 maps the first image to frame 0, the second image to frame 10, and the third image to the last frame.
The batch parameter in the Encoder or Decoder: setting a higher value may increase speed, at the risk of GPU OOM.
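To make the indices mapping concrete, here is a small sketch of how a string such as 0,10,-1 could resolve to absolute frame numbers. `resolve_indices` is a hypothetical helper, not the node's actual implementation. It assumes negative values count back from the end in the usual Python way, which matches -1 meaning the last frame; the 88-frame length is only an illustrative value.

```python
def resolve_indices(indices, num_frames):
    """Map a comma-separated indices string to absolute frame numbers."""
    frames = []
    for part in indices.split(","):
        i = int(part.strip())
        if not -num_frames <= i < num_frames:
            raise ValueError(f"index {i} is outside a {num_frames}-frame video")
        frames.append(i % num_frames)  # negative values count back from the end
    return frames


# Assumed video length of 88 frames, for illustration only.
print(resolve_indices("0,10,-1", 88))  # [0, 10, 87]
```

With three reference images, the first would land on frame 0, the second on frame 10, and the third on frame 87, the last frame of an 88-frame video.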