ComfyUI-ChatterboxTTS

★ 13

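ComfyUI nodes for an open-source text-to-speech (TTS) model

Integrates Chatterbox TTS into ComfyUI to synthesize high-quality speech from text, providing production-grade open-source TTS.

💡 Quickly turn generated or input text into high-quality speech for projects or demos.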

🍴 3 Forks · 💻 Python · 🔄 2025-05-30

📦 requirements.txt

numpy~=1.26.0
resampy==0.4.3
librosa==0.10.0
s3tokenizer
torch==2.6.0
torchaudio==2.6.0
transformers==4.46.3
diffusers==0.29.0
resemble-perth==1.0.1
omegaconf==2.3.0
conformer==0.3.2
chatterbox-tts

📄 README

ComfyUI-ChatterboxTTS

Chatterbox TTS, billed as the first production-grade open-source TTS model, is now available in ComfyUI through the ComfyUI-ChatterboxTTS custom nodes.

Installation

Make sure you have ComfyUI installed.

Clone this repository into your ComfyUI’s custom_nodes directory:

cd ComfyUI/custom_nodes
git clone https://github.com/Yuan-ManX/ComfyUI-ChatterboxTTS.git

Install dependencies:

cd ComfyUI-ChatterboxTTS
pip install chatterbox-tts
pip install -r requirements.txt
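
To confirm that the install picked up the pinned versions, a small check along these lines may help. It is a minimal sketch and not part of the repository: `parse_requirement`, `satisfies`, and `check` are hypothetical helper names, the comparison is a simplification of pip's rules (only `==` and `~=` with plain numeric versions are understood), and the pins are copied from the requirements.txt listed above.

```python
import re
from importlib import metadata

# Pins copied from this project's requirements.txt.
REQUIREMENTS = """
numpy~=1.26.0
resampy==0.4.3
librosa==0.10.0
s3tokenizer
torch==2.6.0
torchaudio==2.6.0
transformers==4.46.3
diffusers==0.29.0
resemble-perth==1.0.1
omegaconf==2.3.0
conformer==0.3.2
chatterbox-tts
"""

_REQ = re.compile(r"^([A-Za-z0-9_.\-]+)\s*(==|~=)?\s*(\S+)?$")


def parse_requirement(line):
    """Return (name, operator, version), or None for blank/comment lines."""
    line = line.split("#", 1)[0].strip()
    match = _REQ.match(line) if line else None
    return match.groups() if match else None


def _release(version):
    """Leading numeric part of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(p) for p in match.group(0).split(".")) if match else ()


def satisfies(installed, op, wanted):
    """Simplified check; anything other than '==' or '~=' counts as unpinned."""
    if op is None or wanted is None:
        return True
    if op == "==":
        # Ignore local build tags such as "+cu124".
        return installed.split("+")[0] == wanted
    # "~=": same release series as the pin, and not older than the pin.
    inst, want = _release(installed), _release(wanted)
    return inst[: len(want) - 1] == want[:-1] and inst >= want


def check(lines):
    """Return a list of human-readable problems (an empty list means all good)."""
    problems = []
    for line in lines:
        req = parse_requirement(line)
        if req is None:
            continue
        name, op, wanted = req
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if not satisfies(installed, op, wanted):
            problems.append(f"{name}: found {installed}, expected {op}{wanted}")
    return problems


if __name__ == "__main__":
    for problem in check(REQUIREMENTS.splitlines()):
        print(problem)
```

Run it with the same Python interpreter that ComfyUI uses; no output means every listed package is present and matches its pin under these simplified rules.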

Model

Download Pretrained Models

Chatterbox TTS Pretrained Models

Tips

General Use (TTS and Voice Agents):

The default settings (exaggeration=0.5, cfg_weight=0.5) work well for most prompts.

If the reference speaker has a fast speaking style, lowering cfg_weight to around 0.3 can improve pacing.

Expressive or Dramatic Speech:

Try lower cfg_weight values (e.g. ~0.3) and increase exaggeration to around 0.7 or higher.

Higher exaggeration tends to speed up speech; reducing cfg_weight helps compensate with slower, more deliberate pacing.
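
The tips above can be captured as named presets. The following is a minimal sketch rather than the node's actual implementation: `PRESETS`, `pick_settings`, and `synthesize` are hypothetical names, and the calls to `ChatterboxTTS.from_pretrained` and `model.generate` assume the Python API of the upstream chatterbox-tts package, which this README does not document.

```python
# Presets taken from the tips above; the preset names are made up for this sketch.
PRESETS = {
    "general": {"exaggeration": 0.5, "cfg_weight": 0.5},
    # Reference speaker talks fast: lower cfg_weight to improve pacing.
    "fast_reference": {"exaggeration": 0.5, "cfg_weight": 0.3},
    # Expressive or dramatic speech: raise exaggeration, lower cfg_weight.
    "expressive": {"exaggeration": 0.7, "cfg_weight": 0.3},
}


def pick_settings(style="general"):
    """Return a copy of the generation settings for a named style."""
    if style not in PRESETS:
        raise ValueError(f"unknown style {style!r}; choose from {sorted(PRESETS)}")
    return dict(PRESETS[style])


def synthesize(text, style="general", audio_prompt_path=None,
               device="cpu", out_path="output.wav"):
    """Generate speech with the chosen preset (assumed chatterbox-tts API)."""
    # Imported lazily so the presets can be inspected without the model installed.
    import torchaudio
    from chatterbox.tts import ChatterboxTTS

    model = ChatterboxTTS.from_pretrained(device=device)
    wav = model.generate(text, audio_prompt_path=audio_prompt_path,
                         **pick_settings(style))
    torchaudio.save(out_path, wav, model.sr)
    return out_path
```

For example, `synthesize("Hello there.", style="expressive", audio_prompt_path="speaker.wav")` would apply the dramatic-speech settings, where `speaker.wav` stands in for your own reference recording. The 0.7 and 0.3 values are starting points from the tips, not fixed requirements.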