ComfyUI-Dia

★ 6

Tags: ComfyUI plugin, Dia integration, text-to-speech, Python extension
Integrates Dia into ComfyUI so that users can call Dia directly from the interface and add speech generation to their workflows.
🍴 2 Forks · 💻 Python · 🔄 2025-04-24
📦 requirements.txt
descript-audio-codec
huggingface-hub
numpy
pydantic
soundfile
torch
torchaudio
triton
📄 README

ComfyUI-Dia

Make Dia available in ComfyUI.

A TTS model capable of generating ultra-realistic dialogue in one pass.

Dia is a 1.6B-parameter text-to-speech model created by Nari Labs. It generates highly realistic dialogue directly from a transcript. You can condition the output on audio, which enables control over emotion and tone. The model can also produce nonverbal sounds such as laughter, coughing, and throat clearing.

Installation

  • Make sure you have ComfyUI installed.
  • Clone this repository into your ComfyUI custom_nodes directory:

    cd ComfyUI/custom_nodes
    git clone https://github.com/Yuan-ManX/ComfyUI-Dia.git

  • Install the dependencies:

    cd ComfyUI-Dia
    pip install -r requirements.txt
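
After installing, it can help to confirm that the packages from requirements.txt are visible to the Python interpreter that runs ComfyUI. The following is a minimal sketch, not part of this repository. The helper `missing_modules` is hypothetical, and the import names `dac` (for descript-audio-codec) and `huggingface_hub` (for huggingface-hub) are assumptions about how those packages expose their modules.

```python
import importlib.util

# Import names assumed to correspond to the packages in requirements.txt.
# "dac" for descript-audio-codec and "huggingface_hub" for huggingface-hub
# are assumptions about how those distributions expose their modules.
REQUIRED_MODULES = [
    "dac",
    "huggingface_hub",
    "numpy",
    "pydantic",
    "soundfile",
    "torch",
    "torchaudio",
    "triton",
]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run the script with the same Python environment that launches ComfyUI. The check only locates each module and does not import it, so it stays fast even for large packages such as torch.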

Model

Pretrained model checkpoints: the model weights are hosted on Hugging Face. The model currently supports English generation only.
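
If you want to fetch the weights ahead of time instead of relying on a download at first use, the huggingface-hub package listed in requirements.txt can do it. This is a hedged sketch rather than the node's own loading code. The repository id `nari-labs/Dia-1.6B` is an assumption that this README does not state, and `fetch_dia_checkpoint` is a hypothetical helper name.

```python
# Assumed Hugging Face repository id; this README does not name it explicitly.
DEFAULT_REPO_ID = "nari-labs/Dia-1.6B"


def fetch_dia_checkpoint(repo_id=DEFAULT_REPO_ID, cache_dir=None):
    """Download all files of a Hugging Face model repo, or reuse the local cache.

    Returns the path of the local directory that holds the files.
    """
    # Imported lazily so this module still loads where huggingface-hub is absent.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, cache_dir=cache_dir)
```

Calling `fetch_dia_checkpoint()` needs network access on the first run. Later calls should reuse the cached files, and `cache_dir` can point at a custom location if the default Hugging Face cache is not suitable.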