ComfyUI-productfix

★ 13

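Tags: image generation, e-commerce images, text and logo preservation, latent injection
Preserves the text, logos, and details of product images when generating with Latent Injection in ComfyUI, avoiding deformation.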
🍴 4 Forks · 💻 Python · 🔄 2025-05-12
📦 requirements.txt
diffusers
easyocr

🎨 ComfyUI-productfix

A ComfyUI custom node that helps generate images while preserving the text, logos, and details of e-commerce products.

🎬 Demo

AI-generated images of items in my room taken with a smartphone (no color correction).

📌 Index

  • Introduction
  • Features
  • Models and Custom Nodes
  • Application
  • Approach
  • Install
  • How to use

    🚀 Introduction

    Images generated with Stable Diffusion look natural and high-fidelity, but the input object is often deformed during generation. The problem is most noticeable in elements with artificial regularity, such as text and brand logos, and it is a serious limitation when working with real products sold in e-commerce.

    Productfix provides an approach called Latent Injection, which generates images while preserving the characteristics of the input object (text, logos, details, etc.). It also offers additional nodes that help retain fine object details.

    These nodes are expected to greatly reduce the post-processing that previously had to be done in design tools such as Photoshop or Illustrator. You can integrate them into your own ComfyUI workflow.

    💡 Features

    Apply Latent Injection

    • Hijacks the KSampler node in ComfyUI to perform Latent Injection.
    • Restores the original KSampler node after execution.
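
    The hijack-then-restore pattern described above can be sketched with a context manager. This is a minimal illustration, not the node's actual code: `hijacked`, `fake_sampler`, `plain_sample`, and `sample_with_injection` are hypothetical names, and the stand-in object replaces whatever ComfyUI object really owns the sampling function.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def hijacked(owner, name, replacement):
    """Temporarily replace owner.<name>, restoring the original on exit."""
    original = getattr(owner, name)
    setattr(owner, name, replacement)
    try:
        yield original
    finally:
        # Runs even if sampling raises, so the original is always restored.
        setattr(owner, name, original)


def plain_sample(latent):
    # Hypothetical stand-in for the unmodified sampling function.
    return latent


def sample_with_injection(latent):
    # Hypothetical stand-in for a sampling function that also injects latents.
    return latent + 1


# Hypothetical stand-in for the object that owns the sampling function.
fake_sampler = SimpleNamespace(sample=plain_sample)

with hijacked(fake_sampler, "sample", sample_with_injection):
    inside = fake_sampler.sample(0)   # patched behaviour
outside = fake_sampler.sample(0)      # original behaviour again
```

    Restoring in a `finally` block means other nodes in the workflow see the unmodified sampler even when a run fails partway through.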

    Get Text Mask

    • Loads a text mask as a tensor using the EasyOCR package.
    • An EasyOCR custom node already exists (https://github.com/JaidedAI/EasyOCR), but this node is recommended because its PIL usage is not stable.
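
    As a rough sketch of how OCR detections can become a mask, the helper below rasterizes detections into a binary array. `boxes_to_mask` and its `pad` parameter are hypothetical and not part of this repository. The input is assumed to have the shape returned by EasyOCR's `Reader.readtext`, a list of `(bbox, text, confidence)` tuples where `bbox` holds four `[x, y]` corner points. The example uses a hand-written detection so it runs without EasyOCR installed.

```python
import numpy as np


def boxes_to_mask(detections, height, width, pad=0):
    """Rasterize OCR detections into a float mask (1.0 = text, 0.0 = background)."""
    mask = np.zeros((height, width), dtype=np.float32)
    for bbox, _text, _confidence in detections:
        xs = [point[0] for point in bbox]
        ys = [point[1] for point in bbox]
        # Simplification: fill the axis-aligned bounding rectangle of the
        # four corner points, clamped to the image bounds.
        x0 = max(int(min(xs)) - pad, 0)
        y0 = max(int(min(ys)) - pad, 0)
        x1 = min(int(max(xs)) + pad, width)
        y1 = min(int(max(ys)) + pad, height)
        mask[y0:y1, x0:x1] = 1.0
    return mask


# Hand-written detection standing in for easyocr.Reader(["en"]).readtext(image).
fake_detections = [([[2, 1], [6, 1], [6, 3], [2, 3]], "LOGO", 0.98)]
text_mask = boxes_to_mask(fake_detections, height=5, width=8)
```

    A small positive `pad` can help cover anti-aliased glyph edges that fall just outside the detected box.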

    Reset Model Patcher Calculate Weight

    • Many custom nodes (e.g., ComfyUI-Easy-Use, https://github.com/yolain/ComfyUI-Easy-Use.git) raise errors if another node has replaced the calculate weight function of ModelPatcher.
    • This node resets it to the original ModelPatcher calculate weight function to resolve such issues.

    📝 Models and Custom Nodes

    Models

  • realisticVisionV60B1_v51HyperVAE
  • ic light
  • depth controlnet v1.1
  • more detail lora

    Custom nodes

  • ComfyUI-productfix
  • comfyui_controlnet_aux v1.0.7
  • ComfyUI Impact Pack v8.14.2
  • ComfyUI-Easy-Use v1.3.0
  • ComfyUI_essentials v1.1.0
  • ComfyUI-IC-Light-Native v1.0.1 (not ComfyUI-IC-Light)

    🏃🏻‍♂️ Application

    ### Comparing “IC-Light + Text” / “IC-Light + Text + Latent Injection”

    condition / Input / IC-Light / latent injection ($\sigma_{end}$ = 1.0) / latent injection ($\sigma_{end}$ = 0.5)

    prompt: product photo, professional photography, realistic, leaf, outdoors / seed: 42

    ### Comparing “IC-Light + IP-Adapter” / “IC-Light + IP-Adapter + Latent Injection”

    condition / Input / IC-Light / latent injection ($\sigma_{end}$ = 1.0) / latent injection ($\sigma_{end}$ = 0.5)

    prompt: product photo, professional photography, realistic / seed: 42

    Latent injection truly shines when used together with IC-Light and IP-Adapter. Try it when compositing template-style images and products!

    ### IC-Light + controlnet + text condition + Text transfer + Latent Injection

    Items in my room captured with my phone camera

    prompt: product photo, professional photography, realistic, water, bubble / seed: 42 / controlnet: depth

    prompt: product photo, professional photography, realistic, flowers, outdoors / seed: 42 / controlnet: depth

    ### IC-Light + controlnet + IP-Adapter + Text transfer + Latent Injection

    Items in my room captured with my phone camera

    prompt: product photo, professional photography, realistic / seed: 42 / controlnet: depth

    prompt: product photo, professional photography, realistic / seed: 42 / controlnet: depth

    ### Text transfer

    Input / text condition / image condition (IP-Adapter)

    only IC-Light / Latent injection / detail transfer / Text transfer

    close up

    only IC-Light / Latent injection / detail transfer / Text transfer

    close up

    Text transfer is a detail transfer application based on OCR text masks, developed to preserve the text of input objects. You can implement it using the GetTextMask node and the DetailTransfer node.
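
    The source does not describe the DetailTransfer node's algorithm, so the following is only a sketch of one common way to transfer detail inside a text mask: copy the high-frequency component (image minus its blur) from the source into the generated image. `box_blur`, `transfer_text_detail`, and `radius` are hypothetical names, and single-channel float arrays are assumed for brevity.

```python
import numpy as np


def box_blur(img, radius):
    """Mean filter over a (2*radius+1)^2 window with edge padding (2D input)."""
    height, width = img.shape
    size = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    total = np.zeros((height, width), dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return total / (size * size)


def transfer_text_detail(source, generated, text_mask, radius=2):
    """Swap in the source's fine detail wherever text_mask is 1.0."""
    source_detail = source - box_blur(source, radius)
    generated_detail = generated - box_blur(generated, radius)
    # Outside the mask the generated image is returned untouched; inside it,
    # the low frequencies (lighting) stay and the fine detail comes from source.
    return generated + text_mask * (source_detail - generated_detail)
```

    Because only the high-frequency part is replaced, relighting applied by the generation step is kept while the glyph edges come from the original photo.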

    ### Upscaled results + Text detail transfer

    🛠 Approach

    ### Background: Inpainting

    Inpainting in diffusion models generates an image conditioned on a mask. At each sampling step, the latent of the original image and the latent being generated are composited according to the mask. This preserves the input object during generation, but when the input is low quality (e.g., a smartphone photo), the quality of the output is limited as well.
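
    Following the description above, the per-step compositing can be written as a single update. The notation is an assumption introduced for illustration: $m$ is 1 where the image is regenerated and 0 where the original is kept, $z^{orig}$ is the latent of the original image, and $z_t$ is the latent produced by the sampler at step $t$.

$$
z_t \leftarrow m \odot z_t + (1 - m) \odot z^{orig}
$$

    The kept region is copied from the input at every step, which is why the output cannot look better than the input there.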

    ### Background: IC-Light

    IC-Light is an innovative adapter UNet that manipulates foreground and background lighting. By relighting the input object, it can turn even low-quality inputs into high-quality output images. However, object details can still be deformed during foreground generation.

    ### Background: Kandinsky Inpainting Process

    Kandinsky diffusion inpainting differs from typical inpainting. When compositing latents at each sampling step, it uses the original latent with noise added according to the scheduler’s sigma value, rather than the clean original. Keeping the noise level consistent with the current sample improves quality.
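
    A minimal sketch of this noised compositing is shown below. It assumes the k-diffusion convention in which a latent at noise level sigma is `x_0 + sigma * noise`; other schedulers scale the signal as well, so treat the formula as an assumption. `noised_composite` is a hypothetical helper, and here `mask` is 1.0 where the original is kept.

```python
import numpy as np


def noised_composite(sample, original, mask, sigma, noise):
    """Blend a noised copy of the original latent into the current sample.

    mask is 1.0 where the original should be kept and 0.0 elsewhere.
    Assumes the k-diffusion convention x_sigma = x_0 + sigma * noise.
    """
    noised_original = original + sigma * noise
    return mask * noised_original + (1.0 - mask) * sample
```

    At `sigma = 0` this reduces to pasting the clean original into the masked region, which matches the plain compositing described for typical inpainting.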

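    ### Background: CLIP Skip

    CLIP Skip, as commonly implemented, takes the text embedding from an earlier layer of the CLIP text encoder instead of running the encoder to its last layer. The relevant idea here is that conditioning does not have to be applied all the way to the end: stopping it midway gives control over the conditioning process and can produce more contextually appropriate results.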

    ### Solution: Latent injection

    • $X_t$: sample
    • $M$: product mask
    • $P$: product latent
    • $CO$: composition operation (e.g., add, overlay, soft light)

    To achieve both preservation of object features and meaningful lighting changes, a composite strategy is applied. During the sampling process, latent spaces with added noise are composited to preserve fine object details. Additionally, to reflect the global lighting changes of IC-Light, the initial and final steps of sampling are selectively skipped. This method operates based on the scheduler’s sigma value, ensuring stable performance across various scheduler types. As a result, it is possible to flexibly apply lighting effects while preserving the unique characteristics of the object.
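
    One way to read the strategy above is as a sigma-gated compositing step. The sketch below is not the repository's implementation. It assumes that injection runs only while the scheduler's sigma lies between a start value and $\sigma_{end}$, that noise is added as `P + sigma * noise`, and that $CO$ is a simple masked replace. `latent_injection_step`, `replace_blend`, and `sigma_start` are hypothetical names; only $\sigma_{end}$ appears in the comparisons above.

```python
import numpy as np


def replace_blend(x_t, noised_p, m):
    """Simplest stand-in for CO: overwrite the masked region of the sample."""
    return m * noised_p + (1.0 - m) * x_t


def latent_injection_step(x_t, p, m, sigma, noise,
                          sigma_start, sigma_end, co=replace_blend):
    """Composite the noised product latent into x_t inside the sigma window.

    Steps with sigma above sigma_start (early, high noise) or below sigma_end
    (late, low noise) are skipped so the relighting model can still change
    global lighting and finish the image on its own.
    """
    if not (sigma_end <= sigma <= sigma_start):
        return x_t
    return co(x_t, p + sigma * noise, m)
```

    Gating on sigma rather than on a step index keeps the window meaningful when the number of steps or the scheduler changes, which is the stability property the paragraph above refers to.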

    📥 Install

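    cd custom_nodes
    git clone {this repository}
    # enter the cloned folder (assumed to be named ComfyUI-productfix) before installing
    cd ComfyUI-productfix
    pip install -r requirements.txt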

    🖥 How to use

    ComfyUI-workflows

  • IC-Light + controlnet + text condition + Text transfer + Latent Injection
  • You can download the workflow here.

  • IC-Light + controlnet + IP-Adapter + Text transfer + Latent Injection
  • You can download the workflow here.

    Demo Example Assets

  • Product example image
  • Style example image

    📚 Reference

    This project is based on research and code from several papers and open-source repositories.

  • IC-Light: https://github.com/lllyasviel/IC-Light
  • kandinsky2.2: https://github.com/ai-forever/Kandinsky-2
  • clip-skip: https://medium.com/@natsunoyuki/clip-skip-with-the-diffusers-library-b2b63f38a443
  • Anton Razzhigaev, Arseniy Shakhmatov, Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion, arXiv, 2023
  • Chuanxia Zheng, Long Tung Vuong, Jianfei Cai, Dinh Phung, MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation, arXiv, 2022