ComfyUI-Vui

★ 4

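Audio generation · LLaMA · Transformer · ComfyUI nodes
Provides Vui nodes in ComfyUI that use a LLaMA-based transformer model to predict audio tokens, for integration into audio generation and processing workflows.
💡 Generate or predict audio tokens inside a ComfyUI workflow to synthesize audio.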
🍴 1 fork · 💻 Python · 🔄 2025-06-12
📦 Netdisk download
Copy the link below and open it in Quark Netdisk to download:
https://pan.quark.cn/s/a9fb3a59e10c
📦 requirements.txt
einops==0.8.1
inflect==7.5.0
numba==0.61.2
numpy==2.2.6
openai-whisper
pydantic==2.11.5
pyannote.audio==3.3.2
soundfile==0.13.1
tiktoken==0.9.0
torch==2.7.1
torchaudio
transformers==4.52.4
📄 README

ComfyUI-Vui

ComfyUI-Vui makes Vui available in ComfyUI. Vui is a LLaMA-based transformer that predicts audio tokens.

Installation

  • Make sure you have ComfyUI installed.
  • Clone this repository into your ComfyUI custom_nodes directory:
    cd ComfyUI/custom_nodes
    git clone https://github.com/Yuan-ManX/ComfyUI-Vui.git

  • Install the dependencies:
    cd ComfyUI-Vui
    pip install -r requirements.txt
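
After installing, it can help to confirm that the active Python environment matches the pins listed in requirements.txt. The script below is a minimal sketch, not part of ComfyUI-Vui: the helper names `parse_requirement` and `check_requirements` are hypothetical, the package list is copied from the requirements.txt shown on this page, and it only handles plain `name` and `name==version` lines. It uses the standard-library `importlib.metadata` module and ignores local build suffixes such as `+cu126`, which some torch builds report.

```python
from importlib import metadata

# Package list copied from the requirements.txt shown on this page.
REQUIREMENTS = """\
einops==0.8.1
inflect==7.5.0
numba==0.61.2
numpy==2.2.6
openai-whisper
pydantic==2.11.5
pyannote.audio==3.3.2
soundfile==0.13.1
tiktoken==0.9.0
torch==2.7.1
torchaudio
transformers==4.52.4
"""


def parse_requirement(line):
    """Split 'name==version' into (name, version); version is None if unpinned."""
    name, sep, version = line.strip().partition("==")
    return name, (version if sep else None)


def check_requirements(text):
    """Return (name, wanted, installed) for every entry that is missing or mismatched."""
    problems = []
    for line in text.splitlines():
        # Skip blank lines and comments.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        name, wanted = parse_requirement(line)
        try:
            # Drop a local build suffix such as '+cu126' before comparing.
            installed = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None or (wanted is not None and installed != wanted):
            problems.append((name, wanted, installed))
    return problems


if __name__ == "__main__":
    for name, wanted, installed in check_requirements(REQUIREMENTS):
        print(f"{name}: wanted {wanted or 'any version'}, "
              f"found {installed or 'not installed'}")
```

Run it with the same Python interpreter that ComfyUI uses. No output means every listed package is installed and every pinned version matches.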

Model

Download Pretrained Models

Vui Pretrained Models

  • Vui.BASE is the base checkpoint, trained on 40k hours of audio conversations.
  • Vui.ABRAHAM is a single-speaker model that can reply with context awareness.
  • Vui.COHOST is a checkpoint with two speakers that can talk to each other.