git+https://github.com/arcee-ai/mergekit.git#egg=mergekit lxml












Advanced LoRA merging for ComfyUI with Mergekit integration, supporting 8+ merge algorithms including TIES, DARE, SLERP, and more. Features modular architecture, SVD decomposition, selective layer filtering, and comprehensive validation.
This is an enhanced fork of laksjdjf’s LoRA Merger with extensive refactoring and new features. Core merging algorithms from Mergekit by Arcee AI.
cd ComfyUI/custom_nodes
git clone https://github.com/YourUsername/LoRA-Merger-ComfyUI.git
cd LoRA-Merger-ComfyUI
pip install -r requirements.txt
Requirements:
Mergekit (git+https://github.com/arcee-ai/mergekit.git)
Combine multiple LoRAs into a stack for merging. Dynamically adds connection points as you connect LoRAs.
Inputs:
lora_1, lora_2, … lora_N: LoRABundle inputs (unlimited)
Output:
LoRAStack: Dictionary mapping LoRA names to their patch dictionaries
Load all LoRAs from a directory automatically with unified strength control.
Parameters:
model: The diffusion model the LoRAs will be applied to
directory: Path to folder containing LoRA files
strength_model: General model strength applied to all LoRAs (default: 1.0, range: -10.0 to 10.0)
strength_clip: General CLIP strength applied to all LoRAs (default: 1.0, range: -10.0 to 10.0)
layer_filter: Preset filters (“full”, “attn-only”, “mlp-only”, “attn-mlp”) for architecture-agnostic layer selection
sort_by: “name”, “name descending”, “date”, or “date descending”
limit: Limit number of LoRAs to load (default: -1 for all)
Features:
Outputs:
LoRAStack: Dictionary mapping LoRA names to their patch dictionaries
LoRAWeights: Strength values for each LoRA
LoRARawDict: Raw LoRA state dictionaries for CLIP weights
Decompose LoRA stack into (up, down, alpha) tensor components for merging.
Features:
Parameters:
key_dicts: Input LoRAStack
decomposition_method: Choose from Standard SVD, Randomized SVD, or Energy-Based Randomized SVD
svd_rank: Target rank for decomposition (0 for full rank)
device: Processing device (“cpu”, “cuda”)
Outputs:
components: LoRATensors (decomposed tensors by layer)
Main merging node using Mergekit algorithms. Processes layers in parallel with thread-safe progress tracking.
Parameters:
merge_method: MergeMethod configuration from method nodes
components: Decomposed LoRATensors from decompose node
strengths: LoRAWeights from decompose node
_lambda: Final scaling factor (default: 1.0)
device: Processing device (“cpu”, “cuda”)
dtype: Computation precision (“float32”, “float16”, “bfloat16”)
Features:
Output:
Apply merged LoRA to a model.
Inputs:
model: ComfyUI model to patch
lora: Merged LoRA from merger
Outputs:
model: Patched model
Save merged LoRA to disk in standard format. This also saves the original CLIP weights if present.
Parameters:
lora: Merged LoRA to save
filename: Output filename (without extension)
Each method node configures algorithm-specific parameters. Connect to the merge_method input of PM LoRA Merger.
Simple weighted linear combination.
Parameters:
normalize (bool): Normalize by number of LoRAs (default: True)
Task Arithmetic with Interference Elimination and Sign consensus.
Parameters:
density (float): Fraction of values to keep (0.0-1.0, default: 0.9)
normalize (bool): Normalize merged result (default: True)
Reference: TIES-Merging Paper
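To make the density and normalize parameters concrete, here is a minimal NumPy sketch of a TIES-style merge (trim, elect sign, merge). It is an illustration under stated assumptions, not the Mergekit implementation: the function name ties_merge, the per-tensor magnitude trimming, and the weighted-count normalization are all choices made for this example.

```python
import numpy as np


def ties_merge(deltas, weights, density=0.9, normalize=True):
    """TIES-style merge of same-shaped delta arrays: trim, elect sign, merge."""
    trimmed = []
    for delta, weight in zip(deltas, weights):
        magnitudes = np.abs(delta).ravel()
        keep = int(round(density * magnitudes.size))  # entries that survive trimming
        if keep <= 0:
            trimmed.append(np.zeros_like(delta, dtype=float))
            continue
        # Trim: keep only the `keep` largest-magnitude entries (ties are kept).
        cutoff = np.partition(magnitudes, magnitudes.size - keep)[magnitudes.size - keep]
        trimmed.append(np.where(np.abs(delta) >= cutoff, delta, 0.0) * weight)
    stacked = np.stack(trimmed)

    # Elect: per position, the sign of the summed (weighted) values wins.
    elected = np.sign(stacked.sum(axis=0))
    agree = (np.sign(stacked) == elected) & (elected != 0)

    # Merge: sum only the entries that agree with the elected sign.
    merged = np.where(agree, stacked, 0.0).sum(axis=0)
    if normalize:
        # Assumption: normalize by the total weight of the contributing entries.
        w = np.abs(np.asarray(weights, dtype=float)).reshape((-1,) + (1,) * (stacked.ndim - 1))
        contributing = (agree * w).sum(axis=0)
        merged = merged / np.maximum(contributing, 1e-12)
    return merged
```

With density=0.5, each input keeps only its larger half of values by magnitude, and positions where the inputs disagree in sign are resolved in favor of the larger summed contribution.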
Drop And REscale for efficient model merging.
Parameters:
density (float): Probability of keeping each parameter (default: 0.9)
normalize (bool): Normalize after rescaling (default: True)
Reference: DARE Paper
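The drop-and-rescale step can be sketched in a few lines of NumPy. This is a simplified illustration of the idea for a single delta tensor, assuming a hypothetical helper name (dare_drop_and_rescale) and a seeded generator for reproducibility; it is not the code path Mergekit runs.

```python
import numpy as np


def dare_drop_and_rescale(delta, density=0.9, seed=0):
    """Randomly keep each entry with probability `density`, then rescale by 1/density."""
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    rng = np.random.default_rng(seed)
    keep = rng.random(delta.shape) < density  # Bernoulli(density) mask
    # Rescaling the survivors keeps the expected value of each entry unchanged.
    return np.where(keep, delta / density, 0.0)
```

Because survivors are divided by density, the sparsified delta matches the original in expectation, which is why fairly aggressive drop rates can still give usable merges.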
Depth-Enhanced Low-rank adaptation with Layer-wise Averaging.
Parameters:
density (float): Layer density parameter (default: 0.9)
epsilon (float): Small value for numerical stability (default: 1e-8)
lambda_factor (float): Scaling factor (default: 1.0)
Breadcrumb-based merging strategy.
Parameters:
density (float): Path density (default: 0.9)
tie_method (“sum” or “mean”): How to combine tied parameters
Spherical Linear Interpolation for smooth model interpolation.
Parameters:
t (float): Interpolation factor (0.0-1.0, default: 0.5)
Note: SLERP requires exactly 2 LoRAs. For multiple LoRAs, use PM NuSLERP or PM Karcher.
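For readers unfamiliar with SLERP, the following NumPy sketch shows how the interpolation factor t blends two tensors along the arc between them. The function name slerp and the linear fallback for nearly parallel inputs are assumptions of this example, not a description of the node's internals.

```python
import numpy as np


def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two same-shaped arrays."""
    a_unit = a.ravel() / (np.linalg.norm(a) + eps)
    b_unit = b.ravel() / (np.linalg.norm(b) + eps)
    # Angle between the two tensors, treated as flat vectors.
    omega = np.arccos(np.clip(np.dot(a_unit, b_unit), -1.0, 1.0))
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        # Nearly parallel (or anti-parallel): fall back to plain linear interpolation.
        return (1.0 - t) * a + t * b
    return (np.sin((1.0 - t) * omega) / sin_omega) * a + (np.sin(t * omega) / sin_omega) * b
```

At t=0.0 the result is the first tensor, at t=1.0 the second, and intermediate values follow the arc rather than the straight line between them.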
Normalized Spherical Linear Interpolation for multiple models.
Parameters:
normalize (bool): Normalize result to unit sphere (default: True)
Karcher mean on the manifold (generalized SLERP for N models).
Parameters:
max_iterations (int): Maximum optimization iterations (default: 100)
tolerance (float): Convergence threshold (default: 1e-6)
Standard task vector arithmetic (delta merging).
Parameters:
normalize (bool): Normalize by number of models (default: False)
Selective consensus with threshold-based parameter selection.
Parameters:
threshold (float): Consensus threshold (default: 0.5)
Nearest neighbor parameter swapping.
Parameters:
distance_metric (“cosine” or “euclidean”): Distance measure
Apply block-wise scaling to LoRA weights for fine-grained control over different network layers.
Inputs:
key_dicts: LoRAStack to modify
blocks_store: JSON string containing block scale configuration
Features:
Block Scale Format:
The blocks_store parameter expects a JSON string with the following structure:
{
  "mode": "sdxl_unet",
  "blockScales": {
    "input_blocks.0": 1.0,
    "input_blocks.1": 0.8,
    "middle_block.1": 1.2,
    "output_blocks.0": 0.9
  }
}
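As a rough illustration of how such a configuration could be consumed, the sketch below parses the JSON string and multiplies each patch by the scale of the block its key belongs to. The helper names (scale_for_key, apply_block_scales), the prefix-matching rule, and the key layout are assumptions for this example; the node's real key matching may differ.

```python
import json


def scale_for_key(key, block_scales, default=1.0):
    """Return the scale of the block that `key` belongs to (hypothetical matching rule)."""
    for block, scale in block_scales.items():
        # Match whole path components so "input_blocks.1" does not match "input_blocks.10".
        if key == block or key.startswith(block + "."):
            return float(scale)
    return default


def apply_block_scales(patches, blocks_store):
    """Scale every patch value by the block scale found in the JSON config."""
    config = json.loads(blocks_store)
    block_scales = config.get("blockScales", {})
    return {key: value * scale_for_key(key, block_scales) for key, value in patches.items()}
```

Keys that match no configured block keep their original value, so a partial configuration only touches the blocks it names.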
Supported Architectures:
"sdxl_unet": Stable Diffusion XL UNet blocks
"sd_unet": Stable Diffusion UNet blocks
"dit": Diffusion Transformer blocks
Use Cases:
Output:
Resize all layers in a LoRA to a different rank using tensor decomposition methods.
Parameters:
lora (LoRABundle): Input LoRA to resize
decomposition_method: Decomposition strategy for rank adjustment
  "SVD": Full singular value decomposition (slow but optimal)
  "rSVD": Randomized SVD (fast, recommended for most cases)
  "energy_rSVD": Energy-based randomized SVD (best for DiT/large LoRAs)
new_rank (int): Target rank for all layers (default: 16, range: 1-128)
device: Processing device (“cuda” or “cpu”)
dtype: Computation precision (“float32”, “float16”, “bfloat16”)
Features:
adjust_tensor_dims
Output:
{original_name}_r{new_rank}
Use Cases:
Note: Uses asymmetric singular value distribution (all S values in up matrix), which differs from the symmetric distribution used in lora_decompose.
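The difference between the two singular value distributions can be shown with a small NumPy sketch. It assumes 2D up and down matrices and a hypothetical function name (resize_lora), and it ignores alpha scaling; it is meant to illustrate the note above, not to reproduce the node's code.

```python
import numpy as np


def resize_lora(up, down, new_rank, symmetric=False):
    """Re-factor up @ down at `new_rank` using a truncated SVD."""
    delta = up @ down
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    rank = min(new_rank, s.size)
    u, s, vt = u[:, :rank], s[:rank], vt[:rank, :]
    if symmetric:
        # Symmetric: sqrt(S) goes into both factors.
        root = np.sqrt(s)
        return u * root, root[:, None] * vt
    # Asymmetric: all singular values go into the up matrix.
    return u * s, vt
```

Both variants reproduce the same product up @ down; they differ only in how the magnitude is shared between the two factors, which matters if later steps treat up and down separately.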
Sample different block configurations for layer-wise experiments.
Parameters:
model: The diffusion model the LoRAs will be applied to
positive: Positive conditioning
negative: Negative conditioning
sampler: ComfyUI sampler to use
sigmas: Sigma schedule for sampling
latent_image: Initial latent image
lora: LoRABundle to sample from
vae: VAE model for decoding
add_noise: Whether to add noise to the latent image
noise_seed: Seed for noise generation
control_after_generate: Select control strategy after generation
block_sampling_mode: Sampling mode for blocks (“round_robin_exclude”, “round_robin_include”)
image_display: Whether to display the generated images or their difference from the base image
Iteratively sample images using each LoRA from a stack individually, generating comparison grids.
Parameters:
model: The diffusion model the LoRAs will be applied to
vae: VAE model for decoding latents to images
add_noise: Whether to add noise to latent image (default: True)
noise_seed: Random seed for noise generation (0 to 2^64-1)
cfg: Classifier-free guidance scale (default: 8.0, range: 0.0-100.0)
positive: Positive conditioning prompt
negative: Negative conditioning prompt
sampler: ComfyUI sampler to use (e.g., KSampler, DPM++)
sigmas: Sigma schedule for noise levels
latent_image: Initial latent image to denoise
lora_key_dicts: LoRAStack from decompose or stacker nodes
lora_strengths: LoRAWeights containing strength values for each LoRA
Features:
Outputs:
latents: Concatenated latents for all sampled images
images: Individual annotated images as separate outputs
image_grid: Combined grid view with all samples organized by LoRA and batch
Use Cases:
Workflow Example:
LoRA Stacker → LoRA Stack Decompose → LoRA Stack Sampler → Image Grid
(3 LoRAs) (with model, VAE, prompts)
Systematically sweep through merge parameter values to find optimal settings visually.
Parameters:
model: The diffusion model the merged LoRAs will be applied to
vae: VAE model for decoding latents to images
add_noise: Whether to add noise to latent image (default: True)
noise_seed: Random seed for noise generation (0 to 2^64-1)
cfg: Classifier-free guidance scale (default: 8.0, range: 0.0-100.0)
positive: Positive conditioning prompt
negative: Negative conditioning prompt
sampler: ComfyUI sampler to use
sigmas: Sigma schedule for noise levels
latent_image: Initial latent image to denoise
merge_context: MergeContext output from PM LoRA Merger (Mergekit) node
parameter_name: Name of the merge parameter to sweep (e.g., “t”, “density”, “normalize”)
parameter_values: Parameter values to test (see formats below)
parameter_name_2: Optional second parameter for 2D sweeps (leave empty for 1D)
parameter_values_2: Second parameter values (same formats as parameter_values)
Parameter Value Formats:
The parameter_values and parameter_values_2 fields support multiple input formats:
"min - max | num_points", e.g. "0.25 - 0.75 | 3" → [0.25, 0.5, 0.75]
"min - max : step", e.g. "0.25 - 0.75 : 0.25" → [0.25, 0.5, 0.75]
"val1, val2, val3", e.g. "0.25, 0.5, 0.75" → [0.25, 0.5, 0.75]
"0.5" (single value), e.g. "0.5" → [0.5]
parameter_values input is ignored for boolean parameters
Features:
Outputs:
latents: Stacked latents for all parameter combinations
image_grid: Annotated comparison grid (horizontal for 1D, rows×cols for 2D)
Use Cases:
SLERP Interpolation Sweep:
parameter_name: "t"
parameter_values: "0.0 - 1.0 | 5"
→ Tests t=[0.0, 0.25, 0.5, 0.75, 1.0] to find optimal interpolation point
TIES Density Optimization:
parameter_name: "density"
parameter_values: "0.5, 0.7, 0.9"
→ Compares sparse (0.5), medium (0.7), and dense (0.9) merges
DARE Drop Rate Analysis:
parameter_name: "density"
parameter_values: "0.5 - 0.95 : 0.15"
→ Tests density=[0.5, 0.65, 0.8, 0.95] to balance efficiency vs quality
Boolean Parameter Testing:
parameter_name: "normalize"
parameter_values: "" (ignored)
→ Automatically tests [False, True] to compare normalized vs unnormalized
2D Parameter Sweep (Dual Mode):
parameter_name: "density"
parameter_values: "0.5, 0.7, 0.9"
parameter_name_2: "k"
parameter_values_2: "16, 32, 64"
→ Generates 3 × 3 = 9 images testing all combinations
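The value formats used in the sweeps above (range with point count, range with step, comma-separated list, single value) could be parsed along the following lines. This is a hedged sketch using only the standard library: the function name parse_parameter_values, the regular expression, and the rounding to 10 decimals are assumptions of this example rather than the node's actual parser.

```python
import math
import re

_NUM = r"[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?"
# "min - max | num_points" or "min - max : step" (spaces around the dash are required)
_RANGE = re.compile(rf"^\s*({_NUM})\s+-\s+({_NUM})\s*([|:])\s*({_NUM})\s*$")


def parse_parameter_values(text):
    """Expand a parameter_values string into a list of floats."""
    match = _RANGE.match(text)
    if not match:
        # Comma-separated list or single value; an empty string yields [].
        return [float(part) for part in text.split(",") if part.strip()]
    low, high = float(match.group(1)), float(match.group(2))
    separator, argument = match.group(3), float(match.group(4))
    if separator == "|":
        count = int(argument)
        if count < 1:
            raise ValueError("num_points must be at least 1")
        if count == 1:
            return [low]
        step = (high - low) / (count - 1)
    else:
        step = argument
        if step <= 0:
            raise ValueError("step must be positive")
        # Small tolerance so that max is included despite floating-point error.
        count = math.floor((high - low) / step + 1e-9) + 1
    return [round(low + i * step, 10) for i in range(count)]
```

For a 2D sweep, the same function would be applied to parameter_values and parameter_values_2 separately, and the grid is the Cartesian product of the two lists.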
Workflow Example:
LoRA Stack → Decompose → Method Node (SLERP) → Merger (with MergeContext output)
↓
Parameter Sweep Sampler
(sweep "t" from 0 to 1)
↓
Annotated Image Grid
Notes:
Advanced stacking with per-LoRA configuration, dynamic input management, and intelligent search.
Features:
Search Bar Usage:
Benefits:
The project includes a comprehensive tensor decomposition system with multiple strategies:
Full singular value decomposition for exact low-rank approximation.
Use case: High accuracy, small to medium tensors
Fast approximate SVD using randomized linear algebra.
Use case: Large tensors where speed is critical
Adaptive SVD that automatically selects rank based on energy threshold.
Use case: Automatic rank selection with quality guarantees
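Energy-based rank selection can be illustrated with a short NumPy sketch. It assumes that "energy" means the cumulative share of squared singular values and uses a 0.99 default threshold; both are assumptions of this example, as are the function names. It uses an exact SVD for clarity rather than a randomized one.

```python
import numpy as np


def rank_for_energy(singular_values, energy=0.99):
    """Smallest rank whose squared singular values reach the `energy` fraction."""
    squared = np.asarray(singular_values, dtype=float) ** 2
    total = squared.sum()
    if total == 0.0:
        return 1
    cumulative = np.cumsum(squared) / total
    rank = int(np.searchsorted(cumulative, energy)) + 1
    return min(rank, squared.size)


def energy_svd(matrix, energy=0.99):
    """Truncated SVD with the rank picked by `rank_for_energy`."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    rank = rank_for_energy(s, energy)
    root = np.sqrt(s[:rank])  # symmetric split of the singular values
    return u[:, :rank] * root, root[:, None] * vt[:rank, :]
```

A higher threshold retains more singular values and therefore a higher rank; a layer whose spectrum decays quickly ends up with a small rank automatically.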
All decomposers include:
Selective merging allows targeting specific layer types with architecture-agnostic presets that work seamlessly with both Stable Diffusion and DiT (Diffusion Transformer) LoRAs.
"full": All layers (no filter)
"attn-only": Only attention layers
"mlp-only": Only MLP/feedforward layers
"attn-mlp": Attention + MLP layers combined
You can also create custom filters by providing a set of layer component names:
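from src.utils import LayerFilter

# Custom filter built from a set of layer component names.
# Named layer_filter to avoid shadowing Python's built-in filter().
layer_filter = LayerFilter({"attn1", "proj_in", "proj_out"})
filtered_patches = layer_filter.apply(lora_patches)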
Use Cases:
"attn-only" to merge only attention mechanisms for composition/style
"mlp-only" to merge only feedforward layers for textures/details
"attn-mlp" to exclude projection and normalization layers
"full" to merge all layer types
Spectral norm regularization prevents any single layer from dominating the merge due to large weight magnitudes, leading to more stable and balanced merges.
The spectral norm of a matrix is its largest singular value, which is the Lipschitz constant of the linear transformation with respect to the Euclidean norm. In the context of LoRA merging, it bounds how strongly a layer can amplify its input.
The regularization process uses per-layer clipping:
This prevents outlier layers from having excessive magnitude while preserving the overall LoRA effect for layers with reasonable magnitudes. Unlike global scaling (which would reduce all layers proportionally), per-layer clipping only affects layers that exceed the target.
from src.utils.spectral_norm import apply_spectral_norm

device = "cuda"  # or "cpu"

# Apply spectral norm regularization to LoRA weights
regularized_lora = apply_spectral_norm(
    lora_patches,
    scale=1.0,    # Target maximum spectral norm
    num_iter=10,  # Power iteration count (higher = more accurate)
    device=device,
)
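To show what the scale and num_iter parameters control, here is a minimal NumPy sketch of power iteration followed by per-layer clipping. It is an independent illustration, not the body of apply_spectral_norm: the function names spectral_norm and clip_spectral_norm and the seeded starting vector are assumptions of this example.

```python
import numpy as np


def spectral_norm(matrix, num_iter=10, seed=0):
    """Estimate the largest singular value of a 2D array by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(matrix.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(num_iter):
        u = matrix @ v
        u_norm = np.linalg.norm(u)
        if u_norm == 0.0:
            return 0.0
        u /= u_norm
        v = matrix.T @ u
        v_norm = np.linalg.norm(v)
        if v_norm == 0.0:
            return 0.0
        v /= v_norm
    return float(np.linalg.norm(matrix @ v))


def clip_spectral_norm(matrix, scale=1.0, num_iter=10):
    """Rescale the matrix only if its spectral norm exceeds `scale`."""
    sigma = spectral_norm(matrix, num_iter)
    if sigma <= scale:
        return matrix  # within the target: left untouched
    return matrix * (scale / sigma)
```

Power iteration approaches the true value from below, so more iterations tighten the estimate; layers already under the target pass through unchanged, which is the per-layer clipping behavior described above.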
The system automatically detects LoRA architecture and applies appropriate decomposition and filtering strategies. Currently supports 6 major architectures:
UNet architecture with underscore-separated naming:
down_blocks, up_blocks, mid_block
attn1 (self-attention), attn2 (cross-attention)
ff layers
text_model layers
Example key: lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_q
Flat transformer with dot-separated naming:
diffusion_model.layers.N
attention (qkv projections)
mlp, feed_forward
Example key: diffusion_model.layers.13.attention.qkv.weight
Modern architecture with dual-block structure:
double_blocks (parallel image/text processing)
single_blocks (unified processing)
img_attn_proj, img_attn_qkv
txt_attn_proj, txt_attn_qkv
img_mlp_0, img_mlp_2
txt_mlp_0, txt_mlp_2
Example key: lora_unet_double_blocks_0_img_attn_proj.alpha
Transformer with dot-separated naming and dual attention:
diffusion_model.blocks.N
self_attn.q, self_attn.k, self_attn.v
cross_attn.q, cross_attn.k, cross_attn.v
ffn (w1, w2, w3)
Example key: diffusion_model.blocks.0.cross_attn.q.lora_A.weight
Transformer blocks with attention and dual MLPs:
transformer_blocks.N.attn. (to_k, to_q, to_v, add_k_proj, add_q_proj, add_v_proj)
img_mlp
txt_mlp
Example key: transformer_blocks.0.attn.to_k.alpha
Pure DiT architecture with adaptive layer normalization:
diffusion_model.layers.N
attention.to_k, attention.to_q, attention.to_v
feed_forward (w1, w2, w3)
adaLN_modulation (unique to zImage)
Example key: diffusion_model.layers.0.attention.to_k.lora_A.weight
The layer filter system automatically detects architecture on LoRA load and logs the result:
Detected Wan 2.2 architecture (400 keys)
Detected Flux architecture (584 keys)
Detected Qwen Image Edit architecture (272 keys)
This enables architecture-agnostic preset filters ("attn-only", "mlp-only", "attn-mlp") to work seamlessly across all supported architectures.
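Key-pattern detection of this kind can be sketched with plain string checks. The example below is a guess at how such a detector might look, built only from the key patterns listed above: the function name detect_architecture, the returned labels, the mapping of patterns to architecture names, and the order of the checks are all assumptions, not the project's actual logic.

```python
def detect_architecture(keys):
    """Guess a LoRA architecture label from its key names (hypothetical heuristics)."""
    keys = list(keys)

    def any_contains(*needles):
        return any(needle in key for key in keys for needle in needles)

    if any_contains("double_blocks", "single_blocks"):
        label = "flux"
    elif any(key.startswith("transformer_blocks.") for key in keys):
        label = "qwen_image"
    elif any_contains("diffusion_model.blocks."):
        label = "wan"
    elif any_contains("diffusion_model.layers."):
        # Separate to_k/to_q/to_v projections or adaLN_modulation suggest zImage;
        # otherwise assume the flat transformer with fused qkv projections.
        if any_contains("adaLN_modulation", "attention.to_"):
            label = "zimage"
        else:
            label = "flat_transformer"
    elif any_contains("down_blocks", "up_blocks", "mid_block"):
        label = "sd_unet"
    else:
        label = "unknown"
    print(f"Detected {label} architecture ({len(keys)} keys)")
    return label
```

Once a label is known, a preset such as "attn-only" can be translated into the attention component names of that architecture, which is what lets one preset work across all of them.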
MIT License
Copyright (c) 2024 LoRA Power-Merger Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the “Software”), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.