ComfyUI-Dia

★ 6

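ComfyUI plugin, Dia integration, text-to-speech, Python extension

Integrates Dia into ComfyUI so that users can call Dia directly from the ComfyUI interface and add speech generation to their workflows.

💡 Use Dia directly inside ComfyUI for text-to-speech tasks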

🍴 2 Forks 💻 Python 🔄 2025-04-24

🔗 Original on GitHub

📦 requirements.txt

descript-audio-codec
huggingface-hub
numpy
pydantic
soundfile
torch
torchaudio
triton

📄 README

ComfyUI-Dia

Make Dia available in ComfyUI.

A TTS model capable of generating ultra-realistic dialogue in one pass.

Dia is a 1.6B-parameter text-to-speech model created by Nari Labs. Dia generates highly realistic dialogue directly from a transcript. You can condition the output on audio, which enables control over emotion and tone. The model can also produce nonverbal sounds such as laughter, coughing, and throat clearing.

Installation

Make sure you have ComfyUI installed.

Clone this repository into your ComfyUI’s custom_nodes directory:

cd ComfyUI/custom_nodes
git clone https://github.com/Yuan-ManX/ComfyUI-Dia.git

Install dependencies:

cd ComfyUI-Dia
pip install -r requirements.txt
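After the install step, it can be useful to confirm that every package from requirements.txt actually landed in the Python environment that ComfyUI uses. The following is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and the package list is simply copied from requirements.txt above. It relies only on the standard library.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from requirements.txt.
REQUIRED = [
    "descript-audio-codec",
    "huggingface-hub",
    "numpy",
    "pydantic",
    "soundfile",
    "torch",
    "torchaudio",
    "triton",
]


def missing_packages(names):
    """Return the names that have no installed distribution.

    Hypothetical helper for illustration; it is not provided by ComfyUI-Dia.
    """
    missing = []
    for name in names:
        try:
            version(name)  # raises PackageNotFoundError if not installed
        except PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All packages from requirements.txt are installed.")
```

Run the script with the same Python interpreter that launches ComfyUI; a different interpreter may report packages as missing even though they are installed elsewhere.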

Model

Pretrained model checkpoints – The model weights are hosted on Hugging Face. The model only supports English generation at the moment.
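If you want to fetch the weights ahead of time, one possible approach is the `snapshot_download` function from `huggingface-hub`, which is already listed in requirements.txt. This is a sketch under assumptions: the repository id `nari-labs/Dia-1.6B` is not stated in this README and should be checked against the Hugging Face page, the helper name `fetch_dia_weights` is hypothetical, and the node may well download the weights on its own at first use.

```python
# Assumed repository id; verify it on Hugging Face before relying on it.
DEFAULT_REPO_ID = "nari-labs/Dia-1.6B"


def fetch_dia_weights(repo_id=DEFAULT_REPO_ID):
    """Download a model repository into the local Hugging Face cache.

    Hypothetical helper for illustration; it is not provided by ComfyUI-Dia.
    Returns the local directory that holds the downloaded files.
    """
    # Imported here so the module loads even if huggingface-hub is absent.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


if __name__ == "__main__":
    print("Weights cached at:", fetch_dia_weights())
```

The download needs network access and enough disk space for a 1.6B-parameter checkpoint.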