ComfyUI-Vui

★ 4

Audio Generation · LLaMA · Transformer · ComfyUI Nodes

Provides Vui nodes for ComfyUI that predict audio tokens with a LLaMA-based transformer, for integration into audio generation and processing workflows.

💡 Generate or predict audio tokens within a ComfyUI workflow to synthesize audio.

🍴 1 fork · 💻 Python · 🔄 2025-06-12

📦 requirements.txt

einops==0.8.1
inflect==7.5.0
numba==0.61.2
numpy==2.2.6
openai-whisper
pydantic==2.11.5
pyannote.audio==3.3.2
soundfile==0.13.1
tiktoken==0.9.0
torch==2.7.1
torchaudio
transformers==4.52.4

📄 README

ComfyUI-Vui

ComfyUI-Vui makes Vui available in ComfyUI. Vui is a LLaMA-based transformer that predicts audio tokens.

Installation

Make sure you have ComfyUI installed.

Clone this repository into your ComfyUI’s custom_nodes directory:

cd ComfyUI/custom_nodes
git clone https://github.com/Yuan-ManX/ComfyUI-Vui.git

Install dependencies:

cd ComfyUI-Vui
pip install -r requirements.txt
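
After installing, it can help to confirm that the environment matches the pinned versions, since ComfyUI environments often already contain other versions of torch or numpy. The following is a minimal sketch using only the standard library. The names PINS and check_pins are hypothetical helpers for illustration and are not part of ComfyUI-Vui. The pins are copied from the requirements.txt shown above, with None marking packages listed without a version.

```python
from importlib import metadata

# Pins copied from requirements.txt; None means no version was pinned there.
PINS = {
    "einops": "0.8.1",
    "inflect": "7.5.0",
    "numba": "0.61.2",
    "numpy": "2.2.6",
    "openai-whisper": None,
    "pydantic": "2.11.5",
    "pyannote.audio": "3.3.2",
    "soundfile": "0.13.1",
    "tiktoken": "0.9.0",
    "torch": "2.7.1",
    "torchaudio": None,
    "transformers": "4.52.4",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: problem} for each missing or mismatched package."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        # Ignore local build tags such as "+cu128" when comparing versions.
        if wanted is not None and found.split("+")[0] != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems


if __name__ == "__main__":
    for name, problem in check_pins(PINS).items():
        print(f"{name}: {problem}")
```

Run it with the same Python interpreter that ComfyUI uses. No output means every listed package is installed and every pinned version matches.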

Model

Download Pretrained Models

Vui Pretrained Models

Vui.BASE is the base checkpoint, trained on 40k hours of audio conversations.

Vui.ABRAHAM is a single speaker model that can reply with context awareness.

Vui.COHOST is a checkpoint with two speakers that can talk to each other.