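Audio Generation LLaMA Transformer ComfyUI Node
Provides Vui nodes in ComfyUI that use a LLaMA-based transformer model to predict audio tokens, for integration into audio generation and processing workflows.
💡 Generate or predict audio tokens in a ComfyUI workflow to synthesize audio.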
🍴 1 Fork 💻 Python 🔄 2025-06-12
https://pan.quark.cn/s/a9fb3a59e10c
📦 requirements.txt
einops==0.8.1
inflect==7.5.0
numba==0.61.2
numpy==2.2.6
openai-whisper
pydantic==2.11.5
pyannote.audio==3.3.2
soundfile==0.13.1
tiktoken==0.9.0
torch==2.7.1
torchaudio
transformers==4.52.4
📄 README
ComfyUI-Vui
ComfyUI-Vui makes Vui available in ComfyUI. Vui is a LLaMA-based transformer that predicts audio tokens.
Installation
Make sure you have ComfyUI installed.
Clone this repository into your ComfyUI’s custom_nodes directory:
cd ComfyUI/custom_nodes
git clone https://github.com/Yuan-ManX/ComfyUI-Vui.git
Install dependencies:
cd ComfyUI-Vui
pip install -r requirements.txt
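After the install step, it can be useful to confirm that the pinned versions from requirements.txt are the ones actually present in the environment. The following is a minimal sketch, not part of ComfyUI-Vui: the helper names `parse_requirement` and `check_requirements` are hypothetical, and it assumes the file contains only plain `name==version` or bare `name` lines, as in the listing above. It relies on the standard library's `importlib.metadata` only.

```python
from importlib import metadata
from pathlib import Path


def parse_requirement(line):
    """Split 'name==version' into (name, version); version is None when unpinned."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # skip blank lines and comments
    name, sep, version = line.partition("==")
    return name.strip(), (version.strip() if sep else None)


def check_requirements(lines):
    """Return (name, wanted, installed) for each missing or mismatched package."""
    problems = []
    for line in lines:
        parsed = parse_requirement(line)
        if parsed is None:
            continue
        name, wanted = parsed
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append((name, wanted, None))
            continue
        # Drop a local build suffix such as '+cu126' before comparing.
        if wanted is not None and installed.split("+")[0] != wanted:
            problems.append((name, wanted, installed))
    return problems


if __name__ == "__main__":
    # Assumption: the script is run from inside the ComfyUI-Vui directory.
    req = Path("requirements.txt")
    if req.exists():
        for name, wanted, installed in check_requirements(req.read_text().splitlines()):
            print(f"{name}: wanted {wanted or 'any'}, found {installed or 'not installed'}")
    else:
        print("requirements.txt not found; run this from the ComfyUI-Vui directory")
```

An empty result means every listed package is installed and every pinned version matches. Unpinned entries such as openai-whisper and torchaudio are only checked for presence.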
Model
Download Pretrained Models
Vui Pretrained Models
Vui.BASE is the base checkpoint, trained on 40k hours of audio conversations.
Vui.ABRAHAM is a single-speaker model that can reply with context awareness.
Vui.COHOST is a checkpoint with two speakers that can talk to each other.