ComfyUI-Audio_Quality_Enhancer

★ 45

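Audio processing · AI enhancement · Real-time effects · Cross-platform
Professional-grade audio processing and AI enhancement for ComfyUI, with support for pitch shifting, speed change, volume and normalization, reverb, and echo, so that high-quality audio can be integrated seamlessly into a workflow.
💡 Professional-grade audio enhancement and effects processing inside ComfyUI workflows.
🍴 5 Forks · 💻 Python · 🔄 2025-05-11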
📦 requirements.txt

    # Core dependencies
    torch
    numpy
    soundfile
    scipy
    # Audio processing libraries
    librosa
    # recommended for AudioQualityEnhancer
    demucs
    pedalboard
📄 README

ComfyUI-Audio-Quality-Enhancer

This extension adds advanced audio processing capabilities to ComfyUI with professional-grade audio effects and AI-powered audio enhancement.

Use With ACE Step

Features

🎛️ AI Audio Effects Node

  • Pitch Shifting: Adjust pitch from -12 to +12 semitones
  • Speed Adjustment: Modify playback speed from 0.5x to 2.0x
  • Volume Control: Professional gain control with anti-clipping protection
  • Audio Normalization: Automatic level balancing
  • Reverb: Studio-quality reverb with adjustable room size and amount
  • Echo: Configurable delay and decay for spatial effects
  • Cross-platform: Works on Windows, Linux/WSL, and macOS using SoX

🔊 AI Audio Enhancer Pro Node

  • Source Separation: Powered by Demucs to enhance specific audio elements
  • Targeted Enhancement: Individually process vocals, drums, bass, and other instruments
  • Audio Quality Controls:
      • Enhancement Level: Master control for overall processing intensity
      • Clarity: Mid-frequency enhancement for improved definition
      • Dynamics: Adjustable compression and transient enhancement
      • Warmth: Low-frequency enhancement for richness
      • Air & Brilliance: High-frequency enhancement for sparkle
  • Dolby-like Stereo Effect: Enhanced stereo imaging
  • Fallback Processing: Works even without source separation libraries

    Installation

    1. Install the Extension

    Clone this repository into your ComfyUI’s custom_nodes directory:

    cd ComfyUI/custom_nodes
    git clone https://github.com/ShmuelRonen/ComfyUI-Audio-Quality-Enhancer.git

    2. Install Required Python Dependencies

    cd ComfyUI-Audio-Quality-Enhancer
    pip install -r requirements.txt

    3. Install SoX (Required for Audio Effects)

    Windows

  • Download SoX for Windows from the official SourceForge page
  • Download the .exe installer (e.g., sox-14.4.2-win32.exe)
  • Run the installer:
      • Follow the installation prompts
      • Important: Note the installation directory (default is usually C:\Program Files (x86)\sox-14-4-2\)
  • No need to add to PATH – the extension uses the direct path to SoX

    WSL 2 (Ubuntu)

    sudo apt-get update
    sudo apt-get install sox

    macOS

    brew install sox

    4. Optional: Install Advanced Audio Libraries

    For full functionality of the Audio Enhancer Pro node, install these additional packages:

    pip install demucs pedalboard

    These are optional – the node will work without them but with reduced functionality.
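
To see which of the optional packages are importable in the Python environment that ComfyUI runs in, a small standard-library check like the one below can help. This is only a sketch; the helper name `optional_audio_libs` is illustrative and not part of the extension.

```python
import importlib.util


def optional_audio_libs(names=("demucs", "pedalboard")):
    """Return {package_name: bool} telling whether each package can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in optional_audio_libs().items():
        print(f"{name}: {'installed' if available else 'missing'}")
```

Run it with the same interpreter that launches ComfyUI, since a package installed in a different environment will not be visible to the node.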

    5. Restart ComfyUI

    After installing all required components, restart ComfyUI to load the extension.

    Nodes

    AI Audio Effects

    Applies high-quality audio processing to any audio input.

    Inputs:

  • audio: Audio data from any audio-generating node
  • pitch_shift: Semitone adjustment (-12 to +12)
  • speed_factor: Playback speed modifier (0.5x to 2.0x)
  • sox_path (optional): Custom path to SoX executable
  • gain_db (optional): Volume adjustment in decibels
  • use_limiter (optional): Enable/disable limiter for positive gain
  • normalize_audio (optional): Enable/disable audio normalization
  • add_reverb (optional): Enable/disable reverb effect
  • reverb_amount (optional): Reverb intensity
  • reverb_room_scale (optional): Size of virtual space
  • add_echo (optional): Enable/disable echo effect
  • echo_delay (optional): Time between echo repetitions
  • echo_decay (optional): How quickly echo fades

    Outputs:

  • audio: Processed audio data
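
For readers curious how inputs like these could translate into a SoX call, here is a minimal sketch. It is not the extension's actual implementation, which this README does not show. The function name `build_sox_args`, the default values, the fixed echo gains, the fixed HF-damping value, and the choice of SoX's `tempo` effect (rather than `speed`) are all assumptions. The SoX effects themselves (`norm`, `pitch` in cents, `tempo`, `reverb`, `echo` with delay in milliseconds, `gain -l` for a limiter) are standard. Normalization is placed first and gain last, following the tips in this README.

```python
def build_sox_args(sox, src, dst, pitch_shift=0, speed_factor=1.0, gain_db=0,
                   use_limiter=True, normalize_audio=False,
                   add_reverb=False, reverb_amount=50, reverb_room_scale=50,
                   add_echo=False, echo_delay=0.5, echo_decay=0.5):
    """Build a SoX argument list from node-style parameters (hypothetical mapping)."""
    args = [sox, src, dst]
    if normalize_audio:
        args += ["norm"]  # normalize before the other effects
    if pitch_shift:
        # SoX's pitch effect takes cents: 100 cents per semitone
        args += ["pitch", str(int(round(pitch_shift * 100)))]
    if speed_factor != 1.0:
        # tempo changes duration without changing pitch (assumed behavior)
        args += ["tempo", str(speed_factor)]
    if add_reverb:
        # reverb <reverberance %> <HF-damping %> <room-scale %>
        args += ["reverb", str(reverb_amount), "50", str(reverb_room_scale)]
    if add_echo:
        # echo <gain-in> <gain-out> <delay ms> <decay>
        delay_ms = int(round(echo_delay * 1000))
        args += ["echo", "0.8", "0.9", str(delay_ms), str(echo_decay)]
    if gain_db:
        limiter = ["-l"] if (use_limiter and gain_db > 0) else []
        args += ["gain"] + limiter + [str(gain_db)]  # gain last
    return args


if __name__ == "__main__":
    print(build_sox_args("sox", "in.wav", "out.wav", pitch_shift=-2,
                         speed_factor=0.9, add_reverb=True, reverb_amount=20,
                         gain_db=3))
```

The resulting list can be passed to `subprocess.run(args, check=True)` once SoX is installed.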

    AI Audio Enhancer Pro

    Enhances audio quality using source separation and targeted processing.

    Inputs:

  • audio: Audio data from any audio-generating node
  • enhancement_level: Master control for overall enhancement intensity
  • use_source_separation (optional): Enable/disable Demucs separation
  • demucs_model (optional): Model choice for source separation
  • device (optional): Processing device (CUDA/CPU)
  • vocals_enhance (optional): Vocals enhancement level
  • drums_enhance (optional): Drums enhancement level
  • bass_enhance (optional): Bass enhancement level
  • other_enhance (optional): Other instruments enhancement level
  • clarity (optional): Mid-frequency clarity enhancement
  • dynamics (optional): Dynamic range processing
  • warmth (optional): Low-frequency enhancement
  • air (optional): High-frequency “air” enhancement
  • dolby_effect (optional): Stereo width enhancement
  • simple_mode (optional): Processing mode without source separation
  • apply_limiter (optional): Final limiter to prevent clipping

    Outputs:

  • audio: Enhanced audio data
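
The README does not say how `dolby_effect` is implemented. A common way to widen a stereo image is mid/side processing, sketched below in plain Python as an illustration of the general idea only. The function name `widen_stereo` and the `width` parameter are hypothetical.

```python
def widen_stereo(left, right, width=1.5):
    """Scale the side (L-R) signal relative to the mid (L+R) signal.

    width = 1.0 leaves the signal unchanged, width = 0.0 collapses it to mono,
    and values above 1.0 exaggerate the differences between the channels.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2
        side = (l - r) / 2 * width
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right


if __name__ == "__main__":
    print(widen_stereo([1.0, 0.5], [0.0, 0.5], width=2.0))
```

Widening can push samples beyond the original peak level, which is one reason a final limiter such as `apply_limiter` is useful after this kind of processing.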

    Audio Effect Tips

    Volume Control

  • Gain Control: Use gain_db to increase or decrease volume without distortion
      • Positive values (0 to +20 dB): Increase volume with automatic clipping prevention
      • Negative values (-20 to 0 dB): Decrease volume
      • For best results with multiple effects, set gain last in your workflow
  • Normalization: Enable normalize_audio to automatically balance levels
      • Great for ensuring consistent volume across different audio samples
      • Applied before other effects for best results
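
As background for the decibel values above, a gain in dB corresponds to a linear amplitude factor of 10^(dB/20), so +6 dB roughly doubles the amplitude and +20 dB multiplies it by 10. The sketch below shows that arithmetic together with peak normalization on samples in the range -1.0 to 1.0. The function names are illustrative, and the hard clip is a deliberately crude stand-in for a limiter, not the extension's algorithm.

```python
def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


def apply_gain(samples, gain_db, use_limiter=True):
    """Scale samples by gain_db; optionally clamp to [-1.0, 1.0] to avoid overflow."""
    factor = db_to_linear(gain_db)
    scaled = [s * factor for s in samples]
    if use_limiter:
        # Hard clipping: a placeholder for a real (smoother) limiter.
        scaled = [max(-1.0, min(1.0, s)) for s in scaled]
    return scaled


def normalize_peak(samples, target_peak=1.0):
    """Rescale so the loudest sample reaches target_peak; silence is returned as-is."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s * target_peak / peak for s in samples]


if __name__ == "__main__":
    print(db_to_linear(6))
    print(apply_gain([0.8, -0.2], 6))
    print(normalize_peak([0.25, -0.5]))
```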

    Reverb

    Reverb adds a sense of space to your audio. Here are some suggested settings:

  • Small Room: reverb_amount = 20, reverb_room_scale = 25
  • Medium Room: reverb_amount = 40, reverb_room_scale = 50
  • Large Hall: reverb_amount = 70, reverb_room_scale = 80
  • Cathedral: reverb_amount = 90, reverb_room_scale = 95

    Echo

    Echo creates repeating sound reflections. Good settings to try:

  • Subtle Echo: echo_delay = 0.3, echo_decay = 0.3
  • Moderate Echo: echo_delay = 0.5, echo_decay = 0.5
  • Canyon Echo: echo_delay = 1.0, echo_decay = 0.7
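
To make the two echo parameters concrete, here is a plain-Python sketch of a simple multi-tap echo. It assumes `echo_delay` is in seconds and that `echo_decay` is the factor by which each repetition is quieter than the previous one, which matches the presets above (larger decay values give longer-lasting echoes). The function name `add_echo_taps` and the `repeats` parameter are hypothetical, and the node itself delegates to SoX rather than to code like this.

```python
def add_echo_taps(samples, sample_rate, echo_delay=0.5, echo_decay=0.5, repeats=3):
    """Mix `repeats` delayed copies into the signal, each scaled by echo_decay**k."""
    delay = int(round(echo_delay * sample_rate))  # delay in samples
    out = list(samples) + [0.0] * (delay * repeats)  # room for the echo tail
    for k in range(1, repeats + 1):
        gain = echo_decay ** k
        offset = k * delay
        for i, s in enumerate(samples):
            out[offset + i] += s * gain
    return out


if __name__ == "__main__":
    # A single impulse makes the echo pattern easy to read.
    print(add_echo_taps([1.0], sample_rate=10, echo_delay=0.3, echo_decay=0.5))
```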

    Effect Combinations

  • Phone Call: pitch_shift = 0, speed_factor = 1.0, add_reverb = True, reverb_amount = 10, reverb_room_scale = 10
  • Radio Announcer: pitch_shift = -2, speed_factor = 0.9, add_reverb = True, reverb_amount = 20, gain_db = 3
  • Stadium Announcement: pitch_shift = 0, speed_factor = 1.0, add_reverb = True, reverb_amount = 60, add_echo = True, echo_delay = 0.8
  • Child Voice: pitch_shift = 4, speed_factor = 1.1, gain_db = 2
  • Deep Voice: pitch_shift = -4, speed_factor = 0.9, gain_db = -2
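
If you script your settings outside the ComfyUI interface, the combinations above can be kept as plain data. The sketch below stores them in a dictionary and checks values against the ranges stated in this README (pitch -12 to +12, speed 0.5x to 2.0x, gain -20 to +20 dB). The names `PRESETS`, `DEFAULTS`, and `preset_settings` are illustrative, and the neutral defaults are an assumption rather than the node's documented defaults.

```python
# Neutral values used for any setting a preset does not mention (assumed).
DEFAULTS = {"pitch_shift": 0, "speed_factor": 1.0, "gain_db": 0,
            "add_reverb": False, "add_echo": False}

# Ranges taken from the parameter descriptions in this README.
RANGES = {"pitch_shift": (-12, 12), "speed_factor": (0.5, 2.0), "gain_db": (-20, 20)}

PRESETS = {
    "Phone Call": {"add_reverb": True, "reverb_amount": 10, "reverb_room_scale": 10},
    "Radio Announcer": {"pitch_shift": -2, "speed_factor": 0.9, "add_reverb": True,
                        "reverb_amount": 20, "gain_db": 3},
    "Stadium Announcement": {"add_reverb": True, "reverb_amount": 60,
                             "add_echo": True, "echo_delay": 0.8},
    "Child Voice": {"pitch_shift": 4, "speed_factor": 1.1, "gain_db": 2},
    "Deep Voice": {"pitch_shift": -4, "speed_factor": 0.9, "gain_db": -2},
}


def preset_settings(name):
    """Return the full settings for a preset, validated against RANGES."""
    settings = {**DEFAULTS, **PRESETS[name]}
    for key, (low, high) in RANGES.items():
        if not low <= settings[key] <= high:
            raise ValueError(f"{key}={settings[key]} is outside {low}..{high}")
    return settings


if __name__ == "__main__":
    print(preset_settings("Radio Announcer"))
```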

    Audio Enhancer Tips

    Source Separation Modes

    The use_source_separation option dramatically changes how the Audio Enhancer Pro works:

  • With Source Separation (Recommended):
      • Individual processing of vocals, drums, bass, and other instruments
      • Best for music and complex audio
      • Requires more processing power and the Demucs library
  • Without Source Separation:
      • Simpler, frequency-based enhancement
      • Faster processing
      • Works without additional libraries
      • Two processing modes available: “Standard” (gentle) and “Aggressive” (stronger)

    Enhancement Presets

    Here are some effective enhancement combinations:

  • Vocal Clarity: vocals_enhance = 0.8, clarity = 0.6, dynamics = 0.4, air = 0.5
  • Bass Boost: bass_enhance = 0.9, warmth = 0.7, dynamics = 0.5
  • Full Mix Master: enhancement_level = 0.6, clarity = 0.5, dynamics = 0.6, warmth = 0.4, air = 0.5
  • Lo-Fi Effect: enhancement_level = 0.3, warmth = 0.8, air = 0.1, simple_mode = “Aggressive”
  • Podcast Voice: vocals_enhance = 0.7, clarity = 0.7, dynamics = 0.6, warmth = 0.3

    Usage Examples

    Basic Audio Processing

  • Add any audio-generating node (TTS, audio loader, etc.)
  • Add “AI Audio Effects”
  • Connect the audio output to the effects node input
  • Adjust pitch, speed, reverb, or other settings
  • Connect to “Preview Audio” node to hear the result

    Advanced Audio Enhancement

  • Add any audio-generating node
  • Add “AI Audio Enhancer Pro”
  • Enable source separation for best quality
  • Adjust enhancement parameters for vocals, bass, etc.
  • Connect to “Preview Audio” node

    Combined Processing

    For maximum quality, you can chain both nodes:

  • Add any audio-generating node
  • Add “AI Audio Enhancer Pro” for quality enhancement
  • Add “AI Audio Effects” for creative effects
  • Connect in sequence: Audio Source → Enhancer → Effects → Preview
  • Use Enhancer for quality improvement and Effects for creative sound design

    Cross-Platform Compatibility

    This extension has been tested and works on:

  • Windows 10/11
  • Linux (including WSL 2 on Windows)
  • macOS

    Different environments may require specific setup steps:

    Windows Notes

  • SoX is automatically located in standard installation directories
  • If installed elsewhere, provide the full path in the effects node
  • Performance is best with CUDA-enabled GPUs for the Enhancer node

    WSL 2 Notes

  • SoX is automatically located through the system PATH
  • Enhancer node works well with CPU mode if CUDA isn’t available in WSL

    macOS Notes

  • Install SoX via Homebrew for best compatibility
  • Enhancer node defaults to CPU mode
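
The CUDA-or-CPU choice described in these platform notes can be checked ahead of time with a few lines of Python. The sketch below falls back to CPU whenever PyTorch is missing or reports no CUDA device. The helper name `resolve_device` is hypothetical and this is not the node's own selection logic.

```python
def resolve_device(requested="cuda"):
    """Return "cuda" only if it was requested and PyTorch reports a usable GPU."""
    if requested.lower() != "cuda":
        return "cpu"
    try:
        import torch  # imported lazily so the check works without PyTorch installed
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(resolve_device("cuda"))
```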

    SoX Troubleshooting

    Windows

    If you encounter issues with SoX:

  • Verify the SoX path in the “AI Audio Effects” node:
      • Default: C:\Program Files (x86)\sox-14-4-2\sox.exe
      • If your installation is in a different location, provide the full path to sox.exe
  • Check if SoX is installed correctly:
      • Open Command Prompt
      • Run "C:\Program Files (x86)\sox-14-4-2\sox.exe" --version
      • If you get an error, reinstall SoX

    WSL 2 (Ubuntu)

  • Verify SoX installation:

    sox --version

  • If SoX is not found, install it:

    sudo apt-get update
    sudo apt-get install sox
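
The same checks can be scripted. The sketch below looks for SoX in three places: a custom path, the system PATH, and the default Windows location mentioned above. It then asks the executable for its version. The function names `find_sox` and `sox_version` are illustrative, and the search order is an assumption about what is convenient, not a description of how the extension locates SoX.

```python
import os
import shutil
import subprocess

# Default Windows install location quoted in this README.
DEFAULT_WINDOWS_SOX = r"C:\Program Files (x86)\sox-14-4-2\sox.exe"


def find_sox(sox_path=None):
    """Return the first existing SoX executable among the candidates, or None."""
    candidates = [sox_path, shutil.which("sox"), DEFAULT_WINDOWS_SOX]
    for candidate in candidates:
        if candidate and os.path.isfile(candidate):
            return candidate
    return None


def sox_version(executable):
    """Run `<executable> --version` and return whatever it prints."""
    result = subprocess.run([executable, "--version"],
                            capture_output=True, text=True, timeout=10)
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    path = find_sox()
    print(sox_version(path) if path else "SoX not found")
```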

    Enhanced Audio Processing

    The AI Audio Enhancer Pro node uses several techniques for high-quality processing:

  • Source Separation: Uses Demucs to separate audio into stems for targeted processing
  • Transient Enhancement: Improves attack and clarity of percussion and rhythmic elements
  • Harmonic Processing: Enhances tonal quality of musical elements
  • Frequency-Specific Processing: Tailored enhancement for different parts of the spectrum
  • Adaptive Dynamics: Intelligent compression and expansion based on audio content
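
As a rough illustration of what compression means in the list above, the sketch below implements the static gain curve of a basic downward compressor: levels below a threshold pass unchanged, and levels above it are reduced according to a ratio. This is textbook behavior, not the node's adaptive algorithm, which the README does not document. The function name and the default threshold and ratio are hypothetical.

```python
def compressed_level_db(level_db, threshold_db=-12.0, ratio=4.0):
    """Map an input level (dB) to an output level (dB) for a downward compressor.

    Above the threshold, every `ratio` dB of input yields only 1 dB of output.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


if __name__ == "__main__":
    for level in (-20.0, -12.0, -6.0, 0.0):
        print(level, "->", compressed_level_db(level))
```

With the defaults shown, an input at -6 dB comes out at -10.5 dB, so loud passages are pulled closer to quiet ones and the overall dynamic range shrinks.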

    License

    This project is provided under the MIT License. See LICENSE file for details.

    Credits

  • SoX audio processing library: SoX – Sound eXchange
  • Demucs source separation by Meta Research
  • ComfyUI: ComfyUI