ComfyStereo

★ 41

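Stereo rendering · Disparity map generation · ComfyUI plugin · Performance optimization
Provides two stereo image generation nodes for ComfyUI: Stereo Image Node (based on the Automatic1111 depth script) and LazyStereo, for quickly generating left/right views and disparity maps.
💡 Generate left/right views and disparity maps in ComfyUI for VR or 3D compositing.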
🍴 4 Forks · 💻 Python · 🔄 2026-02-26
📦 requirements.txt

    # Core dependencies for basic stereo image generation
    torch
    numpy
    Pillow
    opencv-python
    numba
    scipy
    psutil
    moderngl

    # Native VR Viewer Dependencies (PyOpenXR)
    # For native VR viewing with auto-launch to headset
    pyopenxr>=1.0.0
    PyOpenGL>=3.1.0
    PyOpenGL_accelerate>=3.1.0
    glfw>=2.0.0
    opencv-python>=4.0.0  # For video playback support
    pygame>=2.0.0  # For audio playback

    # Note: ffmpeg must be installed separately for audio extraction from videos
    # Windows: Download from https://ffmpeg.org/download.html
    # Linux: sudo apt install ffmpeg
    # Mac: brew install ffmpeg

    # StereoDiffusion Dependencies
    # For AI-powered stereo generation using diffusion models
    diffusers>=0.21.0
    transformers
    accelerate
    einops
    tqdm
    scikit-image

    # Note: torch, numpy, Pillow, and opencv-python are already included in base dependencies above
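
Because ffmpeg is not installed by pip, it can help to confirm it is on the PATH before relying on audio extraction. The following is a minimal sketch using only the Python standard library; the helper name `find_ffmpeg` is hypothetical and is not part of ComfyStereo.

```python
import shutil


def find_ffmpeg():
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg not found; install it separately for audio extraction")
    else:
        print(f"ffmpeg found at {path}")
```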
📄 README

ComfyStereo – Stereoscopic 3D Toolkit for ComfyUI

A stereoscopic 3D toolkit for ComfyUI that combines three solutions into one unified package:

  • Stereo Image Generation – Depth-based stereo conversion with GPU acceleration
  • Native VR Viewing – PyOpenXR viewer for direct VR headset viewing
  • StereoDiffusion – AI-powered stereo generation using diffusion models

    Example Workflow

    This example demonstrates the full stereo conversion pipeline using a depth map and video input.

    Example Workflow Graph

    Download workflow:

    Video2Stereo.json — Import directly into ComfyUI


    Final Stereo Output

    This is the final stereoscopic video generated by ComfyStereo.

    Direct link:

    stereovideo.mp4


    Original Input Video

    This is the original monoscopic input video before stereo processing.

    Direct link:

    example-video.mp4


    Depth Map Used

    This is the depth map generated and used in the workflow.

    Direct link:

    depthmap_video.webm


    Features Overview

    Core Stereo Generation

  • GPU-Accelerated Processing – 5-20x faster depth processing with CUDA
  • Advanced Fill Techniques – Multiple interpolation methods (Polylines, Naive, Hybrid Edge, GPU Warp)
  • Edge-Aware Depth Blurring – Reduces artifacts at high divergence settings
  • Multiple Output Formats – Side-by-Side, Top-Bottom, Red-Cyan Anaglyph
  • Batch Video Processing – Memory-efficient video frame processing
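
The batch video processing item can be pictured as converting frames one at a time and running a cleanup pass every few frames. This is only a sketch of that idea under stated assumptions: `process_in_batches` and `convert` are hypothetical names, and `gc.collect()` stands in for whatever cleanup the node actually performs.

```python
import gc
from typing import Callable, Iterable, Iterator, TypeVar

F = TypeVar("F")
R = TypeVar("R")


def process_in_batches(
    frames: Iterable[F],
    convert: Callable[[F], R],
    batch_size: int = 8,
) -> Iterator[R]:
    """Yield converted frames, running a cleanup pass after every batch_size frames."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for index, frame in enumerate(frames, start=1):
        yield convert(frame)
        if index % batch_size == 0:
            # Assumed cleanup step; a GPU pipeline might also release cached memory here.
            gc.collect()


# Usage: integers stand in for frames, doubling stands in for stereo conversion.
results = list(process_in_batches(range(10), lambda f: f * 2, batch_size=4))
```

Writing it as a generator means only one converted frame has to be held at a time, which is the property that keeps memory use flat on long videos.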

    Native VR Viewing

  • Auto-Launch to Headset – Direct VR viewing without browser
  • Multiple Stereo Formats – Side-by-Side, Over-Under, Mono
  • Projection Options – Flat, Curved, 180° Dome, 360° Sphere
  • Image & Video Support – View both stereo images and videos
  • All VR Headsets – Quest, Vive, Index, WMR, and more

    StereoDiffusion AI

  • AI-Powered Generation – Uses diffusion models for stereo creation
  • DDIM Inversion – Null-text optimization for high-quality reconstruction
  • Bilateral Neighbor Attention – Stereo-consistent diffusion
  • ComfyUI Native Models – Works with MODEL/CLIP/VAE inputs
  • Diffusers Support – Also works with HuggingFace model IDs

    Installation

    Method 1: ComfyUI Manager (Recommended)

  • Open ComfyUI Manager
  • Search for “ComfyStereo”
  • Click Install
  • Restart ComfyUI

    Method 2: Manual Installation

  • Clone the repository:

    cd ComfyUI/custom_nodes/
    git clone https://github.com/Dobidop/ComfyStereo.git
    cd ComfyStereo

  • Install dependencies (choose one):

    # Base only (stereo generation + DeoVR)
    pip install -r requirements.txt

  • Restart ComfyUI

    Available Nodes

    Stereo Image Generation Nodes

    1. Stereo Image Node

    The main node for depth-based stereo conversion.

    Inputs:

  • image (IMAGE) – Source image
  • depth_map (IMAGE) – Depth map (grayscale)
  • divergence (FLOAT) – Stereo effect strength (0.05-15.0, default: 3.5)
  • separation (FLOAT) – Additional horizontal shift (-5.0 to 5.0)
  • stereo_balance (FLOAT) – Effect distribution between eyes (-0.95 to 0.95)
  • convergence_point (FLOAT) – Depth level at screen plane (0.0-1.0, default: 0.5)
  • modes – Output format: left-right, right-left, top-bottom, bottom-top, red-cyan-anaglyph
  • fill_technique – Infill method (see Infill Methods)
  • depth_map_blur (BOOLEAN) – Enable edge-aware depth blurring
  • depth_blur_edge_threshold (FLOAT) – Gradient sharpness cutoff (0.1-15.0)
  • batch_size (INT) – Frames per memory cleanup cycle

    Outputs:

  • stereoscope (IMAGE) – Final stereo image
  • blurred_depthmap_left (IMAGE) – Processed left depth map
  • blurred_depthmap_right (IMAGE) – Processed right depth map
  • no_fill_imperfect_mask (MASK) – Unfilled region mask
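
To make the `modes` input concrete, here is a small NumPy sketch of how a left and a right view could be packed into each output layout. It is an illustration, not the node's code: `pack_stereo` is a hypothetical helper, and it assumes that the first word of each mode names the view placed first, and that the anaglyph takes its red channel from the left eye and green/blue from the right eye.

```python
import numpy as np


def pack_stereo(left: np.ndarray, right: np.ndarray, mode: str) -> np.ndarray:
    """Combine two H x W x 3 views into a single frame for the given mode."""
    if left.shape != right.shape:
        raise ValueError("left and right views must have the same shape")
    if mode == "left-right":
        return np.concatenate([left, right], axis=1)
    if mode == "right-left":
        return np.concatenate([right, left], axis=1)
    if mode == "top-bottom":
        return np.concatenate([left, right], axis=0)
    if mode == "bottom-top":
        return np.concatenate([right, left], axis=0)
    if mode == "red-cyan-anaglyph":
        out = right.copy()           # green and blue channels come from the right eye
        out[..., 0] = left[..., 0]   # red channel comes from the left eye
        return out
    raise ValueError(f"unknown mode: {mode}")


# Usage with tiny placeholder views (all zeros for left, all ones for right).
left_view = np.zeros((2, 3, 3), dtype=np.float32)
right_view = np.ones((2, 3, 3), dtype=np.float32)
side_by_side = pack_stereo(left_view, right_view, "left-right")
anaglyph = pack_stereo(left_view, right_view, "red-cyan-anaglyph")
```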

    Native VR Viewer Nodes

    2. Native Stereo Image Viewer

    Launches images directly into the VR headset.

    Inputs:

  • image (IMAGE) – Stereo image
  • stereo_format – Side-by-Side, Over-Under, Mono
  • projection_type – Flat Screen, Curved Screen, 180° Dome, 360° Sphere
  • screen_size (FLOAT) – Virtual screen size (1.0-10.0)
  • screen_distance (FLOAT) – Distance from viewer (1.0-10.0)
  • swap_eyes (BOOLEAN) – Swap left/right
  • auto_launch (BOOLEAN) – Launch into headset
  • background_color – Black, Dark Gray, Gray, White

    Outputs:

  • passthrough (IMAGE) – Original image
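
The `stereo_format` and `swap_eyes` inputs determine which region of the frame each eye sees. The sketch below computes those regions as crop boxes. It assumes the left eye occupies the left half (Side-by-Side) or the top half (Over-Under); `eye_boxes` is a hypothetical helper, not the viewer's API.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive


def eye_boxes(
    width: int, height: int, stereo_format: str, swap_eyes: bool = False
) -> Tuple[Box, Box]:
    """Return (left_eye_box, right_eye_box) for a frame of the given size."""
    if stereo_format == "Side-by-Side":
        half = width // 2
        left, right = (0, 0, half, height), (half, 0, 2 * half, height)
    elif stereo_format == "Over-Under":
        half = height // 2
        left, right = (0, 0, width, half), (0, half, width, 2 * half)
    elif stereo_format == "Mono":
        left = right = (0, 0, width, height)  # both eyes see the whole frame
    else:
        raise ValueError(f"unknown stereo_format: {stereo_format}")
    if swap_eyes:
        left, right = right, left
    return left, right


# Usage: a 3840x1080 side-by-side frame yields two 1920x1080 regions.
left_box, right_box = eye_boxes(3840, 1080, "Side-by-Side")
```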

    3. Native Stereo Video Viewer

    Play stereo videos in VR with keyboard controls.

    Inputs:

  • video_path (STRING) – Path to stereo video
  • stereo_format – Side-by-Side, Over-Under, Mono
  • projection_type – Flat Screen, Curved Screen, 180° Dome, 360° Sphere
  • screen_size (FLOAT) – Virtual screen size
  • screen_distance (FLOAT) – Distance from viewer
  • swap_eyes (BOOLEAN)
  • loop_video (BOOLEAN)
  • auto_launch (BOOLEAN)
  • background_color

    4. Native VR Status

    Check PyOpenXR installation and VR runtime availability.

    Outputs:

  • status_message (STRING) – Diagnostic information
  • is_available (BOOLEAN) – VR readiness
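
A rough idea of what such a status check involves can be sketched with the standard library. This example only tests whether the Python bindings can be found; it assumes pyopenxr is imported as the module `xr`, and it does not probe the VR runtime or headset, which the real node may do. `native_vr_status` is a hypothetical name.

```python
import importlib.util
from typing import Tuple


def native_vr_status(module_name: str = "xr") -> Tuple[str, bool]:
    """Return (status_message, is_available) based on whether the bindings are installed."""
    if importlib.util.find_spec(module_name) is None:
        return (f"Python module '{module_name}' not found; install pyopenxr", False)
    return (
        f"Python module '{module_name}' is installed; a running VR runtime is still required",
        True,
    )


if __name__ == "__main__":
    message, available = native_vr_status()
    print(available, message)
```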

    StereoDiffusion AI Nodes

    5. StereoDiffusion Node

    AI-powered stereo generation using diffusion models.

    Inputs:

  • image (IMAGE) – Source image
  • depth_map (IMAGE) – Depth map
  • scale_factor (FLOAT) – Disparity strength (1.0-20.0, default: 9.0)
  • direction – “uni” (unidirectional) or “bi” (bidirectional) attention
  • deblur (BOOLEAN) – Add noise to unfilled regions
  • num_ddim_steps (INT) – DDIM steps (10-100, default: 50)
  • null_text_optimization (BOOLEAN) – Enable for better quality (slower)
  • guidance_scale (FLOAT) – CFG scale (1.0-20.0, default: 7.5)
  • model_id (STRING) – HuggingFace model ID (fallback if MODEL/CLIP/VAE not provided)

    Outputs:

  • stereo_pair (IMAGE) – Side-by-side stereo image
  • left_image (IMAGE) – Left eye view
  • right_image (IMAGE) – Right eye view

    Supports: SD1.x and SD2.x models (SDXL/FLUX planned)

    Key Parameters Explained

    Divergence

    Controls the strength of the 3D effect. Higher values = more depth perception.

  • Low (1-3): Subtle depth
  • Medium (3-7): Balanced effect
  • High (7-15): Extreme pop-out

    Convergence Point

    Controls which depth appears at screen plane (zero parallax).

  • 0.0 = Nearest depth at screen → Content recedes behind screen
  • 0.5 = Mid-depth at screen → Balanced (default)
  • 1.0 = Furthest depth at screen → Content pops toward viewer

    Use cases:

  • Pop-out mode (1.0): Product displays, comics
  • Window mode (0.0): Subtle depth, natural recession
  • Portrait mode (0.6-0.7): Face at screen, background recedes
  • Landscape mode (0.3-0.4): Foreground pops, horizon recedes
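
The interaction between divergence and convergence point can be expressed as a signed per-pixel parallax. The formula below is an illustrative model, not necessarily the node's exact math: it assumes depth is normalized to [0, 1] with 1.0 as the nearest point, and that divergence is a percentage of image width. `parallax_px` is a hypothetical helper.

```python
def parallax_px(depth: float, divergence: float, convergence_point: float, width: int) -> float:
    """Signed parallax in pixels: positive pops toward the viewer, negative recedes.

    Assumes depth is in [0, 1] with 1.0 = nearest, and divergence is a
    percentage of the image width.
    """
    max_shift = divergence / 100.0 * width
    # convergence_point 0.0 places the nearest depth at the screen plane,
    # 1.0 places the furthest depth there.
    screen_depth = 1.0 - convergence_point
    return max_shift * (depth - screen_depth)


# Usage: a 1000-pixel-wide image with divergence 5.0.
nearest_window_mode = parallax_px(1.0, 5.0, 0.0, 1000)   # nearest point sits on the screen
furthest_window_mode = parallax_px(0.0, 5.0, 0.0, 1000)  # furthest point recedes
nearest_popout_mode = parallax_px(1.0, 5.0, 1.0, 1000)   # nearest point pops out
```

Under this model, moving the convergence point does not change the total depth range (it stays at `max_shift` pixels); it only slides that range in front of or behind the screen plane.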

    Stereo Balance

    Distributes divergence between eyes.

  • 0.0 = Even distribution
  • Positive/negative = Shift effect toward one eye

    Separation

    Additional horizontal shift percentage (independent of depth).
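
Stereo balance and separation can be illustrated together as a split of the total shift between the two eyes. This sketch rests on several assumptions: a positive balance moves more of the effect to the left eye, separation is a percentage of image width shared equally by both eyes, and the views shift in opposite directions. `eye_shifts` is a hypothetical helper.

```python
from typing import Tuple


def eye_shifts(
    parallax: float, stereo_balance: float, separation: float, width: int
) -> Tuple[float, float]:
    """Return (left_shift_px, right_shift_px) for one pixel."""
    sep_px = separation / 100.0 * width         # depth-independent shift
    left_share = 0.5 * (1.0 + stereo_balance)   # 0.5 each when balance is 0.0
    right_share = 0.5 * (1.0 - stereo_balance)
    left_shift = parallax * left_share + sep_px / 2.0
    right_shift = -parallax * right_share - sep_px / 2.0
    return left_shift, right_shift


# Usage: 10 px of parallax, evenly split and then biased toward the left eye.
even = eye_shifts(10.0, 0.0, 0.0, 1000)
biased = eye_shifts(10.0, 0.5, 0.0, 1000)
```

In this model the difference between the two shifts always equals the parallax plus the separation, so balance changes which eye carries the distortion without changing the perceived depth.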

    GPU Acceleration

    Depth processing is automatically GPU-accelerated when CUDA is available:

  • 5-20x faster blur operations
  • Automatic fallback to CPU if GPU unavailable
  • Zero configuration – works out of the box
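
The automatic fallback described above follows a common PyTorch pattern, sketched here. `pick_device` is a hypothetical helper rather than a function exposed by ComfyStereo; the import guard is only there so the snippet also runs where PyTorch is not installed.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch available, so only CPU processing is possible
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Depth processing would run on: {pick_device()}")
```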

    Native VR Setup

    Requirements

  • Install PyOpenXR dependencies:

    pip install -r requirements.txt

  • Install a VR runtime:
      • SteamVR (recommended) – Supports most headsets
      • Oculus Runtime – For Meta Quest headsets
      • Windows Mixed Reality – Built into Windows 10/11
  • Connect your VR headset

    Supported Headsets

  • Meta Quest (1, 2, 3, Pro)
  • HTC Vive / Vive Pro
  • Valve Index
  • Windows Mixed Reality headsets
  • Any OpenXR-compatible device

    StereoDiffusion Setup

    Requirements

  • CUDA-capable GPU with 8GB+ VRAM (16GB recommended)
  • Python 3.8+
  • PyTorch 2.0+

    First Run

  • Downloads the Stable Diffusion model (release version SD1.5, ~4GB) if not cached
  • Null-text optimization takes ~2-3 minutes on a modern GPU
  • Model is cached for faster subsequent runs

    Performance Tips

  • Lower num_ddim_steps to 30 for faster processing
  • Disable null_text_optimization for 3x speed (lower quality)
  • Use guidance_scale 3-5 to reduce “burned” look
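
These tips can be collected into two presets. The parameter names and ranges come from the StereoDiffusion node inputs listed earlier; the preset dictionaries and the `validate` helper are hypothetical and only illustrate one way to keep the values in range.

```python
# Defaults documented for the StereoDiffusion node.
QUALITY = {"num_ddim_steps": 50, "null_text_optimization": True, "guidance_scale": 7.5}

# Faster preset following the tips: fewer steps, no null-text optimization,
# and a guidance scale in the suggested 3-5 range.
FAST = {**QUALITY, "num_ddim_steps": 30, "null_text_optimization": False, "guidance_scale": 4.0}


def validate(settings: dict) -> bool:
    """Check the values against the documented input ranges; raise if out of range."""
    if not 10 <= settings["num_ddim_steps"] <= 100:
        raise ValueError("num_ddim_steps must be between 10 and 100")
    if not 1.0 <= settings["guidance_scale"] <= 20.0:
        raise ValueError("guidance_scale must be between 1.0 and 20.0")
    return True


if __name__ == "__main__":
    for name, preset in (("quality", QUALITY), ("fast", FAST)):
        print(name, preset, validate(preset))
```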

    Troubleshooting StereoDiffusion

  • Out of Memory: Reduce num_ddim_steps, close other apps
  • Black Output: Check depth map is valid grayscale
  • Poor Quality: Enable null_text_optimization, increase num_ddim_steps

    License

    MIT License – see LICENSE file for details.

    Note: This project includes code from multiple sources:

  • StereoDiffusion components are based on StereoDiffusion (MIT License)
  • Diffusion utilities derived from prompt-to-prompt (Apache 2.0)
  • See NOTICE file for full attribution

    Credits

    Created by Dobidop

    Acknowledgments

    StereoDiffusion

    @inproceedings{wang2024stereodiffusion,
      title={StereoDiffusion: Training-Free Stereo Image Generation Using Latent Diffusion Models},
      author={Wang, Lezhong and Frisvad, Jeppe Revall and Jensen, Mark Bo and Bigdeli, Siavash Arjomand},
      booktitle={CVPR},
      year={2024}
    }

    Prompt-to-Prompt

    @article{hertz2022prompt,
      title={Prompt-to-prompt image editing with cross attention control},
      author={Hertz, Amir and Mokady, Ron and Tenenbaum, Jay and Aberman, Kfir and Pritch, Yael and Cohen-Or, Daniel},
      year={2022}
    }

    Contributing

    Contributions welcome! Please submit issues or pull requests.

    Support

  • Issues: GitHub Issues