decord==0.6.0
einops==0.8.1
huggingface_hub==0.29.1
matplotlib==3.7.0
numpy==1.24.4
opencv_python==4.7.0.72
pyarrow==11.0.0
PyYAML==6.0.2
Requests==2.32.3
safetensors==0.4.5
scipy==1.10.1
sentencepiece==0.1.99
torch==2.5.1
torchvision==0.20.1
transformers==4.49.0
flash_attn==2.5.8
accelerate>=0.34.0
wandb
Use different ID migration methods to make stories in ComfyUI
Original methods: see 5 Citation.
1 Installation
In the './ComfyUI/custom_nodes' directory, run the following:
git clone https://github.com/smthemex/ComfyUI_StoryDiffusion.git
2 Requirements
pip install -r requirements.txt
pip install insightface
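After installing, it can help to compare what is actually installed against requirement lines such as torch==2.5.1. The following is a minimal sketch using only the standard library; parse_requirement and report are hypothetical helper names, and only '==' pins are compared (a '>=' pin or a bare name is only checked for presence).

```python
import re
from importlib import metadata

# Matches 'name', 'name==1.2.3', or 'name>=1.2.3' (no extras or markers).
PIN_RE = re.compile(r"^([A-Za-z0-9_.-]+)\s*(==|>=)?\s*(\S+)?$")


def parse_requirement(line):
    """Return (name, operator, version) for one requirement line."""
    match = PIN_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unsupported requirement line: {line!r}")
    return match.groups()


def report(lines):
    """Map each package name to 'ok', 'missing', or a mismatch message."""
    status = {}
    for line in lines:
        name, op, wanted = parse_requirement(line)
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            status[name] = "missing"
            continue
        # Only exact pins are compared; '>=' would need a version parser.
        if op == "==" and found != wanted:
            status[name] = f"installed {found}, pinned {wanted}"
        else:
            status[name] = "ok"
    return status
```

For example, report(open("requirements.txt").read().split()) returns one entry per package; how distribution names such as opencv_python are matched is left to importlib.metadata.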
3 Models
3.1 story_diffusion mode (story only)
├── ComfyUI/models/checkpoints/
| ├── juggernautXL_v8Rundiffusion.safetensors
Download photomaker-v1.bin or photomaker-v2.bin
├── ComfyUI/models/photomaker/
| ├── photomaker-v1.bin or photomaker-v2.bin
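A quick way to confirm this layout before loading a workflow is to test for the files directly. This is a sketch only: check_story_diffusion_models is a hypothetical helper, and the models root is whatever path your ComfyUI install uses.

```python
from pathlib import Path

# Relative paths copied from the 3.1 layout; either PhotoMaker file is enough.
REQUIRED = ["checkpoints/juggernautXL_v8Rundiffusion.safetensors"]
ANY_OF = ["photomaker/photomaker-v1.bin", "photomaker/photomaker-v2.bin"]


def check_story_diffusion_models(models_root):
    """Return a list of missing entries; an empty list means the layout is complete."""
    root = Path(models_root)
    problems = [rel for rel in REQUIRED if not (root / rel).is_file()]
    if not any((root / rel).is_file() for rel in ANY_OF):
        problems.append(" or ".join(ANY_OF))
    return problems
```

Calling check_story_diffusion_models("ComfyUI/models") lists whatever still needs to be downloaded. The same pattern could be reused for the other modes by swapping the two path lists.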
3.2 MS-diffusion mode (two characters in one image)
├── ComfyUI/models/
| ├── photomaker/ms_adapter.bin
| ├── clip_vision/clip_vision_g.safetensors(2.35G) or CLIP-ViT-bigG-14-laion2B-39B-b160k.safetensors(3.43G)
├── ComfyUI/models/controlnet/
| ├── xinsir/controlnet-openpose-sdxl-1.0
| ├── ... other similar models
3.3 kolors face mode (IP is no longer supported; the error with newer versions has been fixed)
├── ComfyUI/models
| ├── photomaker/ipa-faceid-plus.bin
| ├── clip/chatglm3-8bit.safetensors
| ├── clip_vision/clip-vit-large-patch14.safetensors # Kolors-IP-Adapter-Plus and Kolors-IP-Adapter-FaceID-Plus use the same checkpoint.
├── any path/Kwai-Kolors/Kolors
| ├── model_index.json
| ├── vae
| | ├── config.json
| | ├── diffusion_pytorch_model.safetensors (renamed from diffusion_pytorch_model.fp16.safetensors)
| ├── unet
| | ├── config.json
| | ├── diffusion_pytorch_model.safetensors (renamed from diffusion_pytorch_model.fp16.safetensors)
| ├── tokenizer
| | ├── tokenization_chatglm.py # new version; fixes the error with newer diffusers versions
| | ├── ... # all files
| ├── text_encoder
| | ├── modeling_chatglm.py # new version; fixes the error with newer diffusers versions
| | ├── tokenization_chatglm.py # new version; fixes the error with newer diffusers versions
| | ├── ... # all files
| ├── scheduler
| | ├── scheduler_config.json
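The rename of the fp16 weights in vae and unet can be scripted. A minimal sketch, assuming the Kolors folder has already been downloaded locally; rename_fp16_weights is a hypothetical helper and it skips any folder where the target name already exists.

```python
from pathlib import Path

FP16_NAME = "diffusion_pytorch_model.fp16.safetensors"
PLAIN_NAME = "diffusion_pytorch_model.safetensors"


def rename_fp16_weights(kolors_dir, subfolders=("vae", "unet")):
    """Rename the fp16 weight file in each subfolder; return the new paths."""
    renamed = []
    for sub in subfolders:
        src = Path(kolors_dir) / sub / FP16_NAME
        dst = src.with_name(PLAIN_NAME)
        # Never overwrite an existing file; running this twice is harmless.
        if src.is_file() and not dst.exists():
            src.rename(dst)
            renamed.append(dst)
    return renamed
```

Run it once with the path to your Kwai-Kolors/Kolors folder; a second run returns an empty list because nothing is left to rename.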
3.4 flux_pulid mode
pip install -U optimum-quanto
├── ComfyUI/models/
| ├── photomaker/pulid_flux_v0.9.0.safetensors
| ├── clip_vision/EVA02_CLIP_L_336_psz14_s6B.pt
| ├── diffusion_models/flux1-dev-fp8.safetensors
├── ComfyUI/models/clip/
| ├── t5xxl_fp8_e4m3fn.safetensors
| ├── clip_l.safetensors
3.5 storymake mode
Download mask.bin, buffalo_l, and RMBG-1.4 (all three can be downloaded automatically)
├── ComfyUI/models/
| ├── photomaker/mask.bin
| ├── clip_vision/clip_vision_H.safetensors # 2.4G, based on laion/CLIP-ViT-H-14-laion2B-s32B-b79K
├── ComfyUI/models/buffalo_l/
| ├── 1k3d68.onnx
| ├── ...
3.6 InfiniteYou mode
├── any_path/FLUX.1-dev/transformer
| ├── config.json
| ├──diffusion_pytorch_model-00001-of-00003.safetensors
| ├──diffusion_pytorch_model-00002-of-00003.safetensors
| ├──diffusion_pytorch_model-00003-of-00003.safetensors
| ├── diffusion_pytorch_model.safetensors.index.json
or
├── ComfyUI/models/
| ├── diffusion_models/flux1-dev-fp8.safetensors #
├── any_path/sim_stage1/
| ├── image_proj_model.bin
| ├── InfuseNetModel/
| | ├── diffusion_pytorch_model-00001-of-00002.safetensors
| | ├── diffusion_pytorch_model-00002-of-00002.safetensors
| | ├── diffusion_pytorch_model.safetensors.index.json
| | ├── config.json
or
├── any_path/aes_stage2/
| ├── ...
├── ComfyUI/models/antelopev2/
| ├──1k3d68.onnx
| ├──...
Download the GGUF file from here, and enter its local path in the 'easyfunction_lite' node's 'select_method'
├── ComfyUI/models/gguf
| ├── flux1-dev-Q8_0.gguf # or flux1-dev-Q6_K.gguf
Download the svdquant repo from here, and enter its local path in the 'easyfunction_lite' node's 'select_method'
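The FLUX.1-dev transformer and InfuseNetModel folders are sharded, so an interrupted download can leave a shard missing. The sketch below reads the index file and reports absent shards. It assumes the index JSON has a top-level "weight_map" object mapping tensor names to shard filenames (the usual Hugging Face sharded-checkpoint layout); missing_shards is a hypothetical helper.

```python
import json
from pathlib import Path


def missing_shards(model_dir, index_name="diffusion_pytorch_model.safetensors.index.json"):
    """Return shard filenames listed in the index but not present in model_dir."""
    folder = Path(model_dir)
    index = json.loads((folder / index_name).read_text(encoding="utf-8"))
    # Many tensors map to the same shard, so deduplicate the filenames.
    shard_names = sorted(set(index["weight_map"].values()))
    return [name for name in shard_names if not (folder / name).is_file()]
```

An empty list from missing_shards("any_path/FLUX.1-dev/transformer") means every shard named in the index is on disk; it does not verify file contents.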
3.7 UNO mode
Download the LoRA dit_lora.safetensors; use fp8 if VRAM < 24 GB.
├── ComfyUI/models/
| ├── diffusion_models/flux1-dev.safetensors #
| ├── loras/dit_lora.safetensors #
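The fp8 rule of thumb for UNO can be written down as a small check. This is a sketch under assumptions: the 24 threshold comes from the note above, the fp8 file is assumed to be the flux1-dev-fp8.safetensors used by other modes in this README, and both function names are hypothetical.

```python
def pick_flux_checkpoint(vram_gb, threshold_gb=24.0):
    """Return the FLUX checkpoint filename suggested for the given VRAM size."""
    if vram_gb < threshold_gb:
        return "flux1-dev-fp8.safetensors"
    return "flux1-dev.safetensors"


def detect_vram_gb():
    """Return total memory of CUDA device 0 in GiB, or None if it cannot be read."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3
```

Note that a card sold as 24 GB usually reports slightly less than 24 GiB, so detect_vram_gb() combined with the default threshold will tend to suggest fp8 on such cards.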
3.8 RealCustom mode
Download all files from bytedance-research/RealCustom (may require access to the international internet)
├── ComfyUI/models/
| ├── diffusion_models/sdxl-unet.bin #
| ├── photomaker/RealCustom_highres.pth #
| ├── clip/clip_l # the standard one; no need to download it again
| ├── clip/clip_g # the standard one; no need to download it again
| ├── clipvison/vit_so400m_patch14_siglip_384.bin #vit_so400m_patch14_siglip_384
| ├── clipvison/vit_large_patch14_reg4_dinov2.bin #vit_large_patch14_reg4_dinov2.lvd142m
3.9 InstantCharacter mode
Download instantcharacter_ip-adapter.bin
Repos: google/siglip-so400m-patch14-384 and facebook/dinov2-giant
├── ComfyUI/models/photomaker/instantcharacter_ip-adapter.bin
├── anypath/google/siglip-so400m-patch14-384
├── anypath/facebook/dinov2-giant
3.10 DreamO mode
Download dreamo
FLUX repo: flux
BEN2 pth: BEN2_Base.pth, or let it download automatically
Turbo LoRA: alimama-creative/FLUX.1-Turbo-Alpha
├── ComfyUI/models/loras/
| ├── dreamo_cfg_distill.safetensors
| ├── dreamo.safetensors
| ├── dreamo_quality_lora_neg.safetensors # optional, v1.0; works without it; keep it in the same directory as the two LoRAs above
| ├── dreamo_quality_lora_pos.safetensors # optional, v1.0; works without it; keep it in the same directory as the two LoRAs above
| ├── dreamo_dpo_lora.safetensors # optional, v1.1; works without it; keep it in the same directory as the two LoRAs above
| ├── dreamo_sft_lora.safetensors # optional, v1.1; works without it; keep it in the same directory as the two LoRAs above
├── ComfyUI/models/photomaker/
| ├── FLUX.1-Turbo-Alpha.safetensors # the turbo LoRA, renamed
├── anypath/black-forest-labs/FLUX.1-dev
├── ComfyUI/models/BEN2_Base.pth # or any path
3.11 Bagel mode
Download BAGEL-7B-MoT
├── ComfyUI/models/vae/
| ├── ae.safetensors # from flux or BAGEL-7B-MoT
├── Any/path/ByteDance-Seed/BAGEL-7B-MoT/
| ├── all files
3.12 OmniConsistency mode
flux repo: flux
├── ComfyUI/models/photomaker/
| ├── OmniConsistency.safetensors
├── ComfyUI/models/loras/
| ├── any flux loras
3.13 Qwen-Image mode
Qwen-Image-Edit: QuantStack/Qwen-Image-Edit-GGUF # Q6 or Q8; Q4 if low VRAM
Qwen-Image: city96/Qwen-Image-gguf # Q6 or Q8; Q4 if low VRAM
text-encoder: Comfy-Org/Qwen-Image_ComfyUI # fp8 or fp16
vae: Comfy-Org/Qwen-Image_ComfyUI
lightning-lora: lightx2v/Qwen-Image-Lightning # optional
├── ComfyUI/models/gguf/
| ├── qwen-image-edit-q6_k.gguf # or Q8, Q5, Q4
| ├── qwen-image-Q8_0.gguf # or Q6, Q5, Q4
├── ComfyUI/models/loras/
| ├── Qwen-Image-Lightning-4steps-V1.0-bf16.safetensors
| ├── Qwen-Image-Edit-Lightning-8steps-V1.0-bf16.safetensors
├── ComfyUI/models/clip/
| ├── qwen_2.5_vl_7b_fp8_scaled.safetensors
├── ComfyUI/models/vae/
| ├── qwen_image_vae.safetensors
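With several quantizations sitting in the gguf folder, the level can be read from the filename. The sketch below parses tags such as Q8_0 or q6_k and picks the largest one within a bit budget. The helper names and the max_bits parameter are hypothetical, and the naming pattern is inferred only from the filenames listed in this README.

```python
import re

# Matches a trailing quantization tag such as '-Q8_0', '-q6_k' or '-Q4_K_M'.
QUANT_RE = re.compile(r"[-_.](q\d+(?:_[a-z0-9]+)*)\.gguf$", re.IGNORECASE)


def quant_tag(filename):
    """Return the upper-cased quantization tag, or None if there is none."""
    match = QUANT_RE.search(filename)
    return match.group(1).upper() if match else None


def quant_bits(filename):
    """Return the bit width encoded in the tag (8 for 'Q8_0'), or None."""
    tag = quant_tag(filename)
    return int(re.match(r"Q(\d+)", tag).group(1)) if tag else None


def pick_gguf(filenames, max_bits):
    """Return the highest-bit file not exceeding max_bits, or None."""
    candidates = [
        (quant_bits(name), name)
        for name in filenames
        if quant_bits(name) is not None and quant_bits(name) <= max_bits
    ]
    return max(candidates)[1] if candidates else None
```

On a low-VRAM machine, calling pick_gguf with the folder's filenames and max_bits=4 returns a Q4 file if one is present; files without a recognizable tag are ignored.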
4 Example
4.1 story-diffusion
4.2 ms-diffusion
4.3 story-maker or story-and-maker
4.4 consistory
4.5 kolors-face
4.6 pulid-flux
4.7 infiniteyou
4.8 UNO
4.9 RealCustom
4.10 InstantCharacter
4.11 DreamO
4.12 Bagel
4.13 OmniConsistency
4.14 Qwen-Image & Edit
4.15 ComfyUI classic (ComfyUI classic mode; can connect to any ComfyUI-compatible workflow, mainly to make the multi-character clip easier to use)
5 Citation
StoryDiffusion
@article{zhou2024storydiffusion,
title={StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation},
author={Zhou, Yupeng and Zhou, Daquan and Cheng, Ming-Ming and Feng, Jiashi and Hou, Qibin},
journal={arXiv preprint arXiv:2405.01434},
year={2024}
}
IP-Adapter
@article{ye2023ip-adapter,
title={IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models},
author={Ye, Hu and Zhang, Jun and Liu, Sibo and Han, Xiao and Yang, Wei},
journal={arXiv preprint arXiv:2308.06721},
year={2023}
}
MS-Diffusion
@misc{wang2024msdiffusion,
title={MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance},
author={X. Wang and Siming Fu and Qihan Huang and Wanggui He and Hao Jiang},
year={2024},
eprint={2406.07209},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
photomaker
@inproceedings{li2023photomaker,
title={PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding},
author={Li, Zhen and Cao, Mingdeng and Wang, Xintao and Qi, Zhongang and Cheng, Ming-Ming and Shan, Ying},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2024}
}
kolors
@article{kolors,
title={Kolors: Effective Training of Diffusion Model for Photorealistic Text-to-Image Synthesis},
author={Kolors Team},
journal={arXiv preprint},
year={2024}
}
PuLID
@article{guo2024pulid,
title={PuLID: Pure and Lightning ID Customization via Contrastive Alignment},
author={Guo, Zinan and Wu, Yanze and Chen, Zhuowei and Chen, Lang and He, Qian},
journal={arXiv preprint arXiv:2404.16022},
year={2024}
}
Consistory
@article{tewel2024training,
title={Training-free consistent text-to-image generation},
author={Tewel, Yoad and Kaduri, Omri and Gal, Rinon and Kasten, Yoni and Wolf, Lior and Chechik, Gal and Atzmon, Yuval},
journal={ACM Transactions on Graphics (TOG)},
volume={43},
number={4},
pages={1--18},
year={2024},
publisher={ACM New York, NY, USA}
}
infiniteyou
@article{jiang2025infiniteyou,
title={{InfiniteYou}: Flexible Photo Recrafting While Preserving Your Identity},
author={Jiang, Liming and Yan, Qing and Jia, Yumin and Liu, Zichuan and Kang, Hao and Lu, Xin},
journal={arXiv preprint},
volume={arXiv:2503.16418},
year={2025}
}
svdquant
@inproceedings{
li2024svdquant,
title={SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models},
author={Li*, Muyang and Lin*, Yujun and Zhang*, Zhekai and Cai, Tianle and Li, Xiuyu and Guo, Junxian and Xie, Enze and Meng, Chenlin and Zhu, Jun-Yan and Han, Song},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025}
}
@article{wu2025less,
title={Less-to-More Generalization: Unlocking More Controllability by In-Context Generation},
author={Wu, Shaojin and Huang, Mengqi and Wu, Wenxu and Cheng, Yufeng and Ding, Fei and He, Qian},
journal={arXiv preprint arXiv:2504.02160},
year={2025}
}
@inproceedings{huang2024realcustom,
title={RealCustom: narrowing real text word for real-time open-domain text-to-image customization},
author={Huang, Mengqi and Mao, Zhendong and Liu, Mingcong and He, Qian and Zhang, Yongdong},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={7476--7485},
year={2024}
}
@article{mao2024realcustom++,
title={Realcustom++: Representing images as real-word for real-time customization},
author={Mao, Zhendong and Huang, Mengqi and Ding, Fei and Liu, Mingcong and He, Qian and Zhang, Yongdong},
journal={arXiv preprint arXiv:2408.09744},
year={2024}
}
@article{tao2025instantcharacter,
title={InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework},
author={Tao, Jiale and Zhang, Yanbing and Wang, Qixun and Cheng, Yiji and Wang, Haofan and Bai, Xu and Zhou, Zhengguang and Li, Ruihuang and Wang, Linqing and Wang, Chunyu and others},
journal={arXiv preprint arXiv:2504.12395},
year={2025}
}
@article{deng2025bagel,
title = {Emerging Properties in Unified Multimodal Pretraining},
author = {Deng, Chaorui and Zhu, Deyao and Li, Kunchang and Gou, Chenhui and Li, Feng and Wang, Zeyu and Zhong, Shu and Yu, Weihao and Nie, Xiaonan and Song, Ziang and Shi, Guang and Fan, Haoqi},
journal = {arXiv preprint arXiv:2505.14683},
year = {2025}
}
@inproceedings{Song2025OmniConsistencyLS,
title={OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data},
author={Yiren Song and Cheng Liu and Mike Zheng Shou},
year={2025},
url={https://api.semanticscholar.org/CorpusID:278905729}
}
@misc{wu2025qwenimagetechnicalreport,
title={Qwen-Image Technical Report},
author={Chenfei Wu and Jiahao Li and Jingren Zhou and Junyang Lin and Kaiyuan Gao and Kun Yan and Sheng-ming Yin and Shuai Bai and Xiao Xu and Yilei Chen and Yuxiang Chen and Zecheng Tang and Zekai Zhang and Zhengyi Wang and An Yang and Bowen Yu and Chen Cheng and Dayiheng Liu and Deqing Li and Hang Zhang and Hao Meng and Hu Wei and Jingyuan Ni and Kai Chen and Kuan Cao and Liang Peng and Lin Qu and Minggang Wu and Peng Wang and Shuting Yu and Tingkun Wen and Wensen Feng and Xiaoxiao Xu and Yi Wang and Yichang Zhang and Yongqiang Zhu and Yujia Wu and Yuxuan Cai and Zenan Liu},
year={2025},
eprint={2508.02324},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2508.02324},
}
@misc{von-platen-etal-2022-diffusers,
author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Dhruv Nair and Sayak Paul and William Berman and Yiyi Xu and Steven Liu and Thomas Wolf},
title = {Diffusers: State-of-the-art diffusion models},
year = {2022},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/huggingface/diffusers}}
}
@misc{lightx2v,
author = {LightX2V Contributors},
title = {LightX2V: Light Video Generation Inference Framework},
year = {2025},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/ModelTC/lightx2v}},
}