colossalai accelerate diffusers ftfy gdown mmengine pre-commit pyav tensorboard timm tqdm transformers wandb


Another ComfyUI implementation of the short video generation project hpcaitech/Open-Sora. It supports the latest V1.2 and V1.1 models as well as image-to-video generation.
pip install packaging ninja
pip install flash-attn --no-build-isolation
git clone https://www.github.com/nvidia/apex
cd apex
sudo python setup.py install --cuda_ext --cpp_ext
pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu121
cd ComfyUI/custom_nodes
git clone https://github.com/bombax-xiaoice/ComfyUI-Open-Sora-I2V
pip3 install -r ComfyUI-Open-Sora-I2V/requirements.txt
If hpcaitech/Open-Sora in standalone mode or chaojie/ComfyUI-Open-Sora has previously been run in the same environment, opensora may already be installed as a Python package. Uninstall it first:
pip3 list | grep opensora
pip3 uninstall opensora
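The same check can be done from Python before starting ComfyUI. This is a minimal sketch using only the standard library; it assumes the conflicting package is registered under the distribution name `opensora` (the name used in the `pip3` commands above), and the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package: str = "opensora"):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("opensora is not installed as a package; nothing to uninstall.")
    else:
        print(f"opensora {version} is installed; run `pip3 uninstall opensora` first.")
```

Run it with the same Python interpreter that ComfyUI uses, otherwise it may inspect a different environment.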
| Configuration | Model Version | VAE Version | Text Encoder Version | Frames | Image Size |
| --- | --- | --- | --- | --- | --- |
| opensora-v1-2 | STDiT3 | OpenSoraVAE_V1_2 | T5XXL | 2,4,8,16*51 | Many, up to 1280×720 |
| opensora-v1-1 | STDiT2 | VideoAutoEncoderKL | T5XXL | 2,4,8,16*16 | Many |
| opensora | STDiT | VideoAutoEncoderKL | T5XXL | 16,64 | 512×512,256×256 |
| pixart | PixArt | VideoAutoEncoderKL | T5XXL | 1 | 512×512,256×256 |
For opensora-v1-2 and opensora-v1-1, as well as the VAEs and T5XXL, model files are downloaded automatically from Hugging Face. For the older opensora and pixart configurations, manually download the model files to models/checkpoints/ under the ComfyUI home directory.
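After a manual download, it can help to confirm that each checkpoint has both a `.json` and a `.safetensors` file with the same base name, since that is how a downloaded file is matched. The sketch below is only an illustration: the helper name `find_checkpoint_pairs` is hypothetical, and it assumes the files sit directly in `models/checkpoints/` rather than in a subfolder.

```python
from pathlib import Path


def find_checkpoint_pairs(checkpoint_dir):
    """Return sorted base names that have both <name>.json and <name>.safetensors."""
    root = Path(checkpoint_dir)
    if not root.is_dir():
        return []
    json_stems = {p.stem for p in root.glob("*.json")}
    weight_stems = {p.stem for p in root.glob("*.safetensors")}
    return sorted(json_stems & weight_stems)


if __name__ == "__main__":
    # Assumed location: run this from the ComfyUI home directory.
    for stem in find_checkpoint_pairs("models/checkpoints"):
        print(stem)
```

A base name missing from the output has only one of the two files, or the two filenames differ in more than their extension.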
- `opensora` and `pixart` do not support automatic download; download them to `models/checkpoints/` under the ComfyUI home directory. Then use `custom_checkpoint` to choose the downloaded folder or file (the `.json` and `.safetensors` files share the same filename apart from their extensions).
- ... `custom_checkpoint` to override the default `hpcai-tech/OpenSora-STDiT-v2-stage3` model
- ... `custom_vae` and `custom_clip` directly
- ... `positive` and `negative` for Open Sora Sampler, and skip the reference input
- ... `negative_prompt` is left empty
- ... `opensora-v1-2` (and does not work all the time). These instructions can also be appended to `positive_prompt` directly, in the format `f'{positive_prompt}. aesthetic score: {aesthetic_score:.1f}. motion score: {motion_strength:.1f}. camera motion: {camera_motion}'`
- ... `positive` and `negative` for Open Sora Sampler (prompts encoded as CONDITIONINGs).
- ... `positive` and `negative`. Or, if verbal descriptions are still necessary, keep them as consistent with the reference image(s) as possible. Otherwise, video frames may jump abruptly between the reference image(s) and content generated from the text. Consider using the more detailed caption task of microsoft/Florence-2-large to generate a base prompt.

Drag the following image into ComfyUI, or open the workflow custom_nodes/ComfyUI-Open-Sora-I2V/t2v-opensora-v1-2-comfy-example.json
Results run under comfy
https://github.com/user-attachments/assets/350cd72b-e7e0-43dd-be97-a978d9d1b500
Drag the following image into ComfyUI, or open the workflow custom_nodes/ComfyUI-Open-Sora-I2V/i2v-opensora-v1-2-comfy-example.json
Results run under comfy
https://github.com/user-attachments/assets/0d2ee49c-4d95-4e45-bc7b-a05141ca038e
- ... `reference` to Open Sora Sampler shares the same width and height that the loaded model is configured for. Use Upscale Image or a similar node to resize the reference image(s) before running Open Sora Encoder.
- ... `positive` and `negative` to Open Sora Sampler. In that case, set `custom_clip` to Skip in the Loader to avoid the unnecessary loading time of the text encoder.
- ... `custom_vae`, as initializing the checkpoint model depends on the VAE.
- ... `motion_strength` (such as 5) can make messy moves and abrupt changes between consecutive frames less likely.
- ... `opensora` code directory derived from the original hpcaitech/Open-Sora project. However, `utils/inference_utils.py`, `utils/ckpt_utils.py` and a few files under `scheduler/` are still modified in order to support ComfyUI features such as a separate text-encoder node, progress bar, preview, etc.
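The prompt suffix format mentioned in the tips earlier (aesthetic score, motion score and camera motion appended to the positive prompt) can be sketched as a small helper. This is an illustration of the string format only, not code from this repository: the function name `build_prompt` is hypothetical, and the example values, including the camera motion `pan right`, are assumptions.

```python
def build_prompt(positive_prompt, aesthetic_score=None, motion_strength=None, camera_motion=None):
    """Append optional score instructions to a prompt, separated by '. '."""
    # Drop a trailing period so the join below does not produce '..'.
    parts = [positive_prompt.strip().rstrip(".")]
    if aesthetic_score is not None:
        parts.append(f"aesthetic score: {aesthetic_score:.1f}")
    if motion_strength is not None:
        parts.append(f"motion score: {motion_strength:.1f}")
    if camera_motion:
        parts.append(f"camera motion: {camera_motion}")
    return ". ".join(parts)


if __name__ == "__main__":
    # Illustrative values only.
    print(build_prompt("A cat walking on the beach", 6.5, 5, "pan right"))
    # A cat walking on the beach. aesthetic score: 6.5. motion score: 5.0. camera motion: pan right
```

Fields left as `None` are simply omitted, so the same helper covers prompts that carry only some of the instructions.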