accelerate aiohttp alias-free-torch audiocraft av cloudpickle colorlog descript-audio-codec descript-audiotools diffusers dora_search einops einops-exts einx flashy kaldiio lameenc nnAudio omegaconf opencv-python openunmix peft pypinyin submitit torch torchaudio torchvision tqdm treetable triton vector_quantize_pytorch x-transformers xformers gguf
SongGeneration: High-Quality Song Generation with Multi-Preference Alignment (SOTA). You can try it if you have more than 12G of VRAM.
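If you are unsure whether your GPU clears the VRAM hint above, a minimal sketch like the following can check it. It is not part of this node: the helper names `meets_vram_hint` and `detect_vram_bytes` are hypothetical, and reading "12G" as 12 GiB is an assumption.

```python
from typing import Optional

GIB = 1024 ** 3
MIN_VRAM_GIB = 12  # threshold from the note above; treating "G" as GiB is an assumption


def meets_vram_hint(total_bytes: int, min_gib: float = MIN_VRAM_GIB) -> bool:
    """Return True when the device has strictly more than `min_gib` GiB of memory."""
    return total_bytes > min_gib * GIB


def detect_vram_bytes() -> Optional[int]:
    """Total memory of CUDA device 0 in bytes, or None if it cannot be determined."""
    try:
        import torch  # imported lazily so the helper above works without torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    total = detect_vram_bytes()
    if total is None:
        print("No CUDA device detected (or torch is not installed).")
    else:
        print(f"{total / GIB:.1f} GiB VRAM, meets hint: {meets_vram_hint(total)}")
```

The check only looks at device 0 and at total (not free) memory, so treat the result as a rough guide.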
In the ./ComfyUI/custom_nodes directory, run the following:
git clone https://github.com/smthemex/ComfyUI_SongGeneration.git
pip install -r requirements.txt
-- ComfyUI/models/SongGeneration/ # 24.4G total (size of the whole folder)
|-- htdemucs.pth # 150M
|-- prompt.pt # 3M
|-- new_prompt.pt # 3M
|-- new_auto_prompt.pt # v2 version
|-- model_2.safetensors
|-- model_2_fixed.safetensors
|-- new_model.pt # renamed from model.pt, optional
|-- large_model.pt # renamed from model.pt, optional
|-- large_model_v2.pt # v2 version; must be renamed from model.pt to be recognized
|-- large_model_v2_Q8_0.gguf # or Q6, optional
|-- ckpt/
|-- encode-s12k.pt # 3.68G
-- ComfyUI/models/vae/
|-- autoencoder_music_1320k.ckpt
-- ComfyUI/models/gguf/
|-- large_model_v2_Q8_0.gguf # or Q6, optional
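To confirm the files above are in place before starting ComfyUI, a small check along these lines may help. It is only a sketch, not something shipped with this node: `find_missing` and the `EXPECTED` table are hypothetical names, the file list is copied from the tree above with the optional entries left out, and which files you really need depends on the model version you load. Files are searched for by name anywhere under each folder, so it makes no assumption about subfolder nesting.

```python
from pathlib import Path
from typing import Dict, List

# File names copied from the tree above; entries marked optional are left out.
# Assumption: adjust this table to the model version you actually use.
EXPECTED: Dict[str, List[str]] = {
    "SongGeneration": [
        "htdemucs.pth",
        "prompt.pt",
        "new_prompt.pt",
        "new_auto_prompt.pt",
        "model_2.safetensors",
        "model_2_fixed.safetensors",
        "large_model_v2.pt",
        "encode-s12k.pt",
    ],
    "vae": ["autoencoder_music_1320k.ckpt"],
}


def find_missing(models_dir: Path, expected: Dict[str, List[str]] = EXPECTED) -> List[str]:
    """Return 'folder/name' for every expected file not found under models_dir."""
    missing = []
    for folder, names in expected.items():
        root = models_dir / folder
        for name in names:
            # rglob also searches subfolders such as ckpt/
            if not (root.is_dir() and any(root.rglob(name))):
                missing.append(f"{folder}/{name}")
    return missing


if __name__ == "__main__":
    # Path is relative to the current directory; change it to your install location.
    for item in find_missing(Path("ComfyUI/models")):
        print("missing:", item)
```

Run it from the directory that contains ComfyUI; no output means every listed file was found.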
@article{lei2025levo,
title={LeVo: High-Quality Song Generation with Multi-Preference Alignment},
author={Lei, Shun and Xu, Yaoxun and Lin, Zhiwei and Zhang, Huaicheng and Tan, Wei and Chen, Hangting and Yu, Jianwei and Zhang, Yixuan and Yang, Chenyu and Zhu, Haina and Wang, Shuai and Wu, Zhiyong and Yu, Dong},
journal={arXiv preprint arXiv:2506.07520},
year={2025}
}