sd-lora-trainer

★ 69

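Supports SDXL · Supports SDv1.5 · LoRa training · Compatible with ComfyUI/AUTO1111
A high-performance trainer maintained by Eden.art for SDXL and SDv1.5. It supports both LoRa training and full finetuning, uses a single script with a unified loss, and produces outputs compatible with ComfyUI and AUTO1111.
💡 Quickly train a LoRa or run a full model finetune in ComfyUI.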
🍴 12 Forks · 💻 Python · 🔄 2025-08-04
📦 requirements.txt
torch==2.1.0
torchaudio==2.1.0
torchvision==0.16.0
transformers==4.38.0
diffusers==0.26.0
tokenizers==0.15.2
huggingface-hub==0.22.2
ujson==5.10.0
scipy==1.14.0
peft==0.10.0
invisible-watermark==0.2.0
pandas==2.2.1
numpy==1.26.4
opencv-python==4.10.0.84
mediapipe==0.10.14
openai==1.35.13
python-dotenv==1.0.1
prodigyopt==1.0
omegaconf==2.3.0
bitsandbytes==0.43.1
setuptools==70.3.0
📄 README

Trainer

This trainer was developed by the Eden team. You can try our hosted version of the trainer in our app.

It’s a highly optimized trainer that can be used for both full finetuning and training LoRa modules on top of Stable Diffusion.

It uses a single training script and loss module that work for both SDv1.5 and SDXL.

The outputs of this trainer are fully compatible with ComfyUI and AUTO1111; see the documentation here.

A full guide on training can be found in our docs.

Training images:

Generated imgs with trained LoRa:

The trainer can be run in 4 different ways:

  • as a hosted service on our website
  • as a hosted service through replicate
  • as a ComfyUI node
  • as a standalone python script

Using in ComfyUI:

  • Example workflows for how to run the trainer and do inference with it can be found in /ComfyUI_workflows
  • Importantly, this trainer uses a ChatGPT call to clean up the auto-generated prompts and inject the trainable token. This only works if the root of the repo dir contains a .env file with a single line holding your OpenAI key: OPENAI_API_KEY=your_key_string. Everything will work without this, but results will be better if you set it up, especially for ‘face’ and ‘object’ modes.
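
Before starting a run, it can be handy to confirm that the key is actually readable from the .env file. The sketch below uses only the standard library and assumes the file holds plain KEY=value lines. The helper name read_env_key is hypothetical and not part of the trainer, which may load the file differently (python-dotenv is listed in requirements.txt).

```python
from pathlib import Path
from typing import Optional


def read_env_key(env_path: str, name: str = "OPENAI_API_KEY") -> Optional[str]:
    """Return the value of `name` from a KEY=value .env file, or None if absent."""
    path = Path(env_path)
    if not path.is_file():
        return None
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop optional surrounding quotes; treat an empty value as missing.
            value = value.strip().strip("'\"")
            return value or None
    return None


if __name__ == "__main__":
    # Assumes the script is run from the root of the repo dir.
    if read_env_key(".env") is None:
        print("OPENAI_API_KEY not found: prompt cleanup will be skipped")
    else:
        print("OPENAI_API_KEY found")
```

The check only reports whether a key is present; it does not validate the key against the OpenAI API.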

The trainer supports 3 default modes:

  • style: used for learning the aesthetic style of a collection of images.
  • face: used for learning a specific face (can be human, character, …).
  • object: will learn a specific object or thing featured in the training images.

Style training example:

    Setup

    Install all dependencies using

    pip install -r requirements.txt

    then you can simply run:

    python main.py train_configs/training_args.json

    to start a training job.

    Adjust the arguments inside training_args.json to set up a custom training job.
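
    Rather than editing training_args.json in place, one option is to derive a new config file per experiment. The sketch below is a minimal helper for that, assuming the config is a flat JSON object. The function name write_custom_config is hypothetical, and no real argument names from the trainer are assumed: the overrides are whatever keys already exist in your base file.

```python
import json
from pathlib import Path
from typing import Any, Dict


def write_custom_config(base_path: str, overrides: Dict[str, Any], out_path: str) -> Dict[str, Any]:
    """Copy a JSON training config, apply top-level overrides, and save the result."""
    config = json.loads(Path(base_path).read_text(encoding="utf-8"))
    unknown = [key for key in overrides if key not in config]
    if unknown:
        # Refuse keys that are missing from the base file; this catches typos early.
        raise KeyError(f"keys not present in {base_path}: {unknown}")
    config.update(overrides)
    Path(out_path).write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config
```

    The resulting file can then be passed to the trainer the same way as the default one, for example python main.py train_configs/my_run.json, where my_run.json is the out_path you chose.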


    You can also run this through Replicate using cog (~docker image):

  • Install Replicate ‘cog’:
    sudo curl -o /usr/local/bin/cog -L "https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
    sudo chmod +x /usr/local/bin/cog

  • Build the image with cog build
  • Run a training run with sh cog_test_train.sh
  • You can also go into the container with cog run /bin/bash

    Full unet finetuning

    When running this trainer in native Python, you can also perform full unet finetuning using something like the following (adjust to your needs):

    python main.py train_configs/full_finetuning_example.json

    TODOs

    Bugs:

  • Pure textual inversion for SD15 does not seem to work well (though it works amazingly well for SDXL). If anyone can figure this one out I’d be forever grateful!
  • Figure out why training is 3x slower through the ComfyUI node than when running main.py as a Python job.
  • Fix aspect_ratio bucketing in the dataloader (see https://github.com/kohya-ss/sd-scripts)
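
For readers unfamiliar with the term, aspect-ratio bucketing groups training images by shape so that each batch can be resized to one shared resolution instead of being center-cropped to a square. The sketch below illustrates the general idea only. It is not this trainer's dataloader nor the sd-scripts implementation, and the bucket list and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical bucket resolutions (width, height), each close to 1024*1024 pixels.
BUCKETS: List[Tuple[int, int]] = [
    (1024, 1024), (1152, 896), (896, 1152), (1216, 832), (832, 1216),
]


def nearest_bucket(width: int, height: int,
                   buckets: List[Tuple[int, int]] = BUCKETS) -> Tuple[int, int]:
    """Pick the bucket whose aspect ratio is closest to the image's."""
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


def group_by_bucket(sizes: Dict[str, Tuple[int, int]]) -> Dict[Tuple[int, int], List[str]]:
    """Map each bucket to the image names assigned to it, so a batch shares one shape."""
    groups: Dict[Tuple[int, int], List[str]] = defaultdict(list)
    for name, (width, height) in sizes.items():
        groups[nearest_bucket(width, height)].append(name)
    return dict(groups)
```

A dataloader built on this would draw each batch from a single bucket and resize its images to that bucket's resolution.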

    Bigger improvements:

  • Integrate Flux / SD3
  • Add multi-concept training (multiple things represented by multiple tokens, trained into a single LoRa)
  • Add stronger token regularization (e.g. CelebBasis spanning basis)
  • Implement perfusion ideas (key locking with superclass): https://research.nvidia.com/labs/par/Perfusion/
  • Implement prompt-aligned: https://prompt-aligned.github.io/