ComfyUI-Unofficial-HSWQ-Quantizer

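Model quantization · HSWQ · FP8 conversion · Calibration and benchmarking
An unofficial implementation of HSWQ (Hybrid Sensitivity Weighted Quantization) for ComfyUI. It includes calibration and FP8 conversion nodes so that models can be optimized and their accuracy evaluated inside a workflow.
💡 Collect calibration data in ComfyUI and convert models to FP8 to optimize inference and memory use.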
2 forks · Python · Updated 2026-02-25

Unofficial ComfyUI reference implementation of Hybrid Sensitivity Weighted Quantization (HSWQ)

Overview

This repository provides an unofficial reference implementation of Hybrid Sensitivity Weighted Quantization (HSWQ) for ComfyUI.

The original HSWQ method and core algorithm were proposed and released by:

👉 https://github.com/ussoewwin/Hybrid-Sensitivity-Weighted-Quantization

Note: This project does not modify the original algorithm. Its purpose is to make HSWQ practically usable inside ComfyUI workflows.

It provides:

  • A calibration node: Collects HSWQ statistics during normal image generation.
  • A conversion node: Applies V1-compatible FP8 quantization using the collected statistics.

This implementation is intended as a practical integration and reference, not as an alternative or competing implementation.


What This Implementation Adds

Compared to the original scripts, this repository focuses on workflow-level integration:

ComfyUI Custom Nodes

  • Calibration (statistics collection): Hooks into the generation process.
  • FP8 conversion: Converts models directly within ComfyUI.

Session-Aware Calibration

  • Accumulation: Statistics can be accumulated across multiple runs.
  • Safe Saving: Uses atomic saving to avoid corrupted stats files.

V1-Compatible FP8 Conversion

  • Smart Layer Selection: Keeps the top-k most sensitive layers in FP16.
  • Optimization: Applies weighted histogram MSE optimization for FP8 amax selection.

All algorithmic decisions follow the design described in the original repository.


Scope and Non-Goals

✅ In Scope

  • Practical ComfyUI integration
  • Reference implementation for real workflows
  • Faithful reproduction of HSWQ V1 behavior

❌ Out of Scope

  • Proposing new quantization algorithms
  • Changing HSWQ theory or selection criteria
  • Replacing the original implementation

Installation

Clone this repository into your ComfyUI custom_nodes directory:

    cd ComfyUI/custom_nodes
    git clone https://github.com/<yourname>/ComfyUI-HSWQ-Quantizer

Restart ComfyUI after installation.

Dependencies

This project requires PyTorch with FP8 support. The benchmark node has additional, optional dependencies:

  • Required: torch with torch.float8_e4m3fn support
  • Benchmark node: lpips, open_clip_torch

Install the benchmark dependencies if you plan to use the benchmark node:

    pip install lpips open_clip_torch
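
To confirm that the local PyTorch build exposes the required FP8 dtype, a quick check like the following may help. This is a minimal sketch; the helper name fp8_supported is hypothetical and not part of this repository.

```python
import importlib.util


def fp8_supported() -> bool:
    """Return True if an importable torch exposes torch.float8_e4m3fn."""
    if importlib.util.find_spec("torch") is None:
        return False  # PyTorch is not installed in this environment
    import torch

    return hasattr(torch, "float8_e4m3fn")


if __name__ == "__main__":
    print("FP8 (e4m3fn) dtype available:", fp8_supported())
```

If this reports False, upgrading PyTorch is likely needed before the conversion nodes can run.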

Provided Nodes

1. HSWQ Calibration (Dual Monitor V2)

Collects calibration statistics while running standard SDXL generation.

Key features:

  • Hooks into UNet forward passes (Linear/Conv2d only).
  • Tracks:
      • Output sensitivity (variance), computed in FP32 and accumulated as a Python float.
      • Input channel importance (mean absolute value per channel).
  • Session-based accumulation with load/restore on disk.
  • Atomic checkpointing via .tmp + os.replace.
  • Wrapper-based enable/disable so normal generation remains unaffected.

Typical usage:

  • Insert the calibration node into your SDXL workflow.
  • Run generation multiple times.
  • Statistics are saved automatically as .pt files.
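
The session-based accumulation and atomic checkpointing described for the calibration node can be illustrated roughly as follows. This is a sketch under stated assumptions, not the node's actual code: the class StatsSession and its methods are hypothetical, JSON stands in for the .pt format so the example runs without PyTorch, and plain lists stand in for per-channel tensors.

```python
import json
import os


class StatsSession:
    """Accumulates per-layer statistics across runs (hypothetical helper)."""

    def __init__(self, path: str):
        self.path = path
        self.stats = {}  # layer name -> {"var_sum", "abs_sum", "count"}
        if os.path.exists(path):
            # Restore an earlier session so new runs add to the old totals.
            with open(path, "r", encoding="utf-8") as f:
                self.stats = json.load(f)

    def update(self, layer: str, output_variance: float, input_abs_mean: list) -> None:
        """Add one observation: output variance and per-channel mean |input|."""
        entry = self.stats.setdefault(
            layer,
            {"var_sum": 0.0, "abs_sum": [0.0] * len(input_abs_mean), "count": 0},
        )
        entry["var_sum"] += float(output_variance)  # accumulated as a Python float
        entry["abs_sum"] = [a + float(b) for a, b in zip(entry["abs_sum"], input_abs_mean)]
        entry["count"] += 1

    def mean_variance(self, layer: str) -> float:
        entry = self.stats[layer]
        return entry["var_sum"] / entry["count"]

    def mean_importance(self, layer: str) -> list:
        entry = self.stats[layer]
        return [a / entry["count"] for a in entry["abs_sum"]]

    def save(self) -> None:
        """Write to a .tmp file first, then swap it in with os.replace."""
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.stats, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.path)  # readers never see a half-written file
```

Because the final step is a rename rather than an in-place write, an interrupted save leaves the previous stats file intact.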

2. SDXL HSWQ FP8 Quantizer (Spec-aligned)

Converts an SDXL UNet model to FP8 using collected calibration statistics.

Conversion process:

  • Load calibration statistics.
  • Rank layers by sensitivity.
  • Keep the most sensitive layers (the top keep_ratio fraction) in FP16.
  • Quantize the remaining layers to FP8 (torch.float8_e4m3fn).
  • Optimize amax using weighted histogram MSE (HSWQ V1).
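
The ranking and keep_ratio steps above can be sketched in a few lines. The function name split_layers and the choice to round the number of protected layers down are assumptions made for illustration; the node itself may round differently.

```python
def split_layers(sensitivity: dict, keep_ratio: float) -> tuple:
    """Split layer names into (kept in FP16, quantized to FP8).

    sensitivity maps layer name -> sensitivity score (higher = more sensitive).
    """
    ranked = sorted(sensitivity, key=sensitivity.get, reverse=True)
    n_keep = int(len(ranked) * keep_ratio)  # assumption: round down
    return ranked[:n_keep], ranked[n_keep:]


if __name__ == "__main__":
    scores = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.2}
    keep, quantize = split_layers(scores, keep_ratio=0.25)
    print("FP16:", keep)      # ['b']
    print("FP8:", quantize)   # ['c', 'd', 'a']
```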

Notable behavior (current implementation):

  • Optional scaled mode (default False) for spec-aligned V1 behavior.
  • Optional comfy_quant and weight_scale buffer injection to help downstream loaders interpret FP8 weights.
  • Skips layers without stats or already in FP8, and normalizes BF16 → FP16 for protected layers.
  • hswq_stats_path is resolved relative to the ComfyUI output directory when possible.

The output model remains compatible with standard ComfyUI loaders.
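
One plausible reading of how hswq_stats_path is resolved "when possible" is sketched below. The function resolve_stats_path and its fallback rule are assumptions, not the node's confirmed logic, and output_dir is passed in explicitly rather than looked up from ComfyUI.

```python
import os


def resolve_stats_path(hswq_stats_path: str, output_dir: str) -> str:
    """Prefer a file under output_dir; otherwise return the path unchanged."""
    if os.path.isabs(hswq_stats_path):
        return hswq_stats_path  # absolute paths are used as given
    candidate = os.path.join(output_dir, hswq_stats_path)
    if os.path.exists(candidate):
        return candidate
    return hswq_stats_path  # assumption: fall back to the path as typed
```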

3. HSWQ FP8 Converter (Legacy V1.2 Logic)

Legacy node for compatibility comparisons with earlier behavior.

Legacy constraints:

  • Fixed optimizer settings (bins/candidates/refinement).
  • scaled=False enforced (clip → cast only).
  • No comfy_quant metadata injection.
  • Uses the same output-variance sensitivity ranking and input importance when available.
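
To give a feel for what a weighted-histogram MSE search over amax candidates involves, here is a simplified sketch. It is not the original HSWQ code. The function names, the default bins and candidates values, the per-weight importance list, the pure-Python emulation of e4m3fn rounding, and the interpretation of the scaled flag are all assumptions, and the refinement stage is omitted.

```python
import math

E4M3_MAX = 448.0  # largest finite float8 e4m3fn value


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest e4m3fn value (emulation; saturates at +/-448)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    a = min(abs(x), E4M3_MAX)
    _, exp = math.frexp(a)       # a = m * 2**exp with 0.5 <= m < 1
    exp = max(exp - 1, -6)       # below 2**-6 the format is subnormal
    step = 2.0 ** (exp - 3)      # 3 mantissa bits: 8 steps per binade
    return sign * min(round(a / step) * step, E4M3_MAX)


def fake_quant(x: float, amax: float, scaled: bool) -> float:
    """Quantize a non-negative magnitude x with clipping threshold amax."""
    clipped = min(x, amax)
    if scaled:  # assumption: map amax onto the FP8 maximum, then undo the scale
        scale = amax / E4M3_MAX
        return round_to_e4m3(clipped / scale) * scale
    return round_to_e4m3(clipped)  # unscaled: clip, then cast


def search_amax(weights, importance, bins=256, candidates=50, scaled=False):
    """Pick the amax candidate with the lowest importance-weighted MSE."""
    abs_max = max(abs(w) for w in weights)
    if abs_max == 0.0:
        return 0.0
    width = abs_max / bins
    hist = [0.0] * bins
    for w, imp in zip(weights, importance):
        idx = min(int(abs(w) / width), bins - 1)
        hist[idx] += imp         # importance-weighted histogram of |w|
    centers = [(i + 0.5) * width for i in range(bins)]
    best_amax, best_err = abs_max, float("inf")
    for k in range(1, candidates + 1):
        amax = abs_max * k / candidates
        err = sum(
            h * (c - fake_quant(c, amax, scaled)) ** 2
            for c, h in zip(centers, hist)
            if h > 0.0
        )
        if err < best_err:
            best_amax, best_err = amax, err
    return best_amax
```

Working on histogram bin centers instead of individual weights keeps the cost of each candidate proportional to the number of bins rather than the size of the layer.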

4. HSWQ Advanced Benchmark

Provides a benchmark node for comparing output fidelity across FP8/FP16 models.

This node requires the lpips and open_clip_torch packages.


Recommended Settings

These settings follow the guidance from the original HSWQ repository:

| Parameter | Typical value | Description |
| :--- | :--- | :--- |
| Calibration samples | ~256 | Number of images/steps to analyze |
| keep_ratio | ~0.25 | Fraction of layers to keep in FP16 |
| Optimization steps | 20–25 | Steps for MSE optimization |

*Exact values may vary depending on the model and dataset.*


Compatibility

  • ComfyUI: Current mainline
  • Model: SDXL UNet
  • Environment: PyTorch with FP8 support (torch.float8_e4m3fn)

Relationship to the Original Project

Algorithm credit and design belong entirely to the original author.

This repository exists solely to bridge HSWQ into ComfyUI.

The original implementation remains the authoritative reference.

Original repository:

👉 https://github.com/ussoewwin/Hybrid-Sensitivity-Weighted-Quantization

Upstream / Reuse

If any part of this implementation is useful:

  • Feel free to reference this repository.
  • Parts may be extracted or adapted upstream if desired.
  • I am happy to rework parts to better match upstream conventions or extract minimal patches/design notes if helpful.

License

This repository follows the same license terms as the original HSWQ project, or provides explicit attribution where applicable.


Change Log

2026-02-06

  • Updated SDXLHSWQQuantizer.py to a spec-aligned FP8 quantizer:
      • added scaled vs. unscaled behavior options,
      • optional ComfyUI metadata buffers (comfy_quant, weight_scale),
      • path resolution for stats under the ComfyUI output directory,
      • safer keep/skip logic for FP8 and BF16 layers.
  • Updated SDXLHSWQQuantizerLegacy.py to preserve V1.2 compatibility:
      • fixed optimizer parameters,
      • forced scaled=False,
      • no metadata injection.
  • Updated SDXLQuantStatsCollector.py to Dual Monitor V2:
      • session restore + atomic save,
      • per-step wrapper-based capture,
      • higher-precision accumulation for output variance and input importance.