ComfyUI-EdgeTTS

★ 68

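Text-to-Speech · Microsoft Edge TTS · Multilingual Support · ComfyUI Node

A ComfyUI text-to-speech node based on Microsoft Edge TTS. It supports multiple languages and voices, is easy to integrate and customize, and generates natural-sounding speech to improve interactive experiences.

💡 Convert text to natural speech inside a ComfyUI workflow, for interactive prompts or voice-over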

🍴 8 Forks · 💻 Python · 🔄 2026-01-25

🔗 Original on GitHub

📦 Cloud drive download

Copy the link, then download from Quark cloud drive:

https://pan.quark.cn/s/8f9eee5e2cdb

📦 requirements.txt

edge-tts>=7.0.0
torchaudio
torchcodec==0.9
openai-whisper>=20231117
numpy
#mutagen>=1.47.0  # For audio metadata handling
torch
googletrans-py>=3.0.0
deep-translator>=1.11.4

📄 README

ComfyUI Audio Nodes

ComfyUI-EdgeTTS is a powerful text-to-speech node for ComfyUI, leveraging Microsoft’s Edge TTS capabilities. It enables seamless conversion of text into natural-sounding speech, supporting multiple languages and voices. Ideal for enhancing user interactions, this node is easy to integrate and customize, making it perfect for various applications.

https://github.com/user-attachments/assets/a5b9165b-a413-49fd-989e-0ef3141afce7

Updates

V1.2.2 (2026-01-25) – Voice ID update and requirements update. For more information, please see the update log.

V1.2.1 (2025-07-23) – Voice ID update and bug fixes. For more information, please see the update log.

V1.2.0 (2025-06-20) – Simplified voice display format, improved performance with lazy loading and caching, and reduced memory usage. For more information, please see the update log.

V1.1.0 (2025-01-24) – Added 19 new languages and 38 new voices, with more detailed characteristics for existing Chinese voices. For more information, please see the update log.

Features

Edge TTS Node

Edge TTS: Convert text to speech using Microsoft Edge TTS

Multiple languages and voices support

Adjustable speech rate and pitch

High-quality voice synthesis

Configurable via config.json

Speech to Text Node

Whisper STT: High-accuracy speech recognition

Multiple language support with auto-detection

Multiple model sizes (tiny to large)

Supports ComfyUI audio format

Language detection confidence reporting

Audio File Node

Save Audio: Export audio files

Supports WAV, MP3, FLAC formats

Quality presets (high/medium/low)

Custom file naming and paths

Automatic file numbering
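The automatic file numbering listed above can be illustrated with a short sketch. This is not the node's actual implementation: the function name `next_numbered_path`, the `prefix_NNNN.ext` naming pattern, and the four-digit padding are all assumptions chosen for illustration.

```python
import re
import tempfile
from pathlib import Path


def next_numbered_path(directory: Path, prefix: str, ext: str) -> Path:
    """Return directory/prefix_NNNN.ext using the next unused number.

    Hypothetical helper: the naming pattern is an assumption, not the
    scheme used by the Save Audio node.
    """
    directory.mkdir(parents=True, exist_ok=True)
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.{re.escape(ext)}$")
    # Collect the numbers already used by files that match the pattern.
    used = [
        int(m.group(1))
        for p in directory.iterdir()
        if (m := pattern.match(p.name))
    ]
    return directory / f"{prefix}_{max(used, default=0) + 1:04d}.{ext}"


if __name__ == "__main__":
    out_dir = Path(tempfile.mkdtemp())
    first = next_numbered_path(out_dir, "audio", "wav")
    first.touch()
    print(first.name, next_numbered_path(out_dir, "audio", "wav").name)
```

Taking the highest existing number plus one, rather than the first gap, keeps files in creation order even after some have been deleted.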

Installation

Method 1. Install via ComfyUI-Manager: search for `Comfyui-EdgeTTS` and install it.

Then install the packages listed in requirements.txt from the ComfyUI-EdgeTTS folder:

```bash
./ComfyUI/python_embeded/python -m pip install -r requirements.txt
```

Method 2. Clone this repository to your ComfyUI custom_nodes folder:

```bash
cd ComfyUI/custom_nodes
git clone https://github.com/1038lab/ComfyUI-EdgeTTS.git
```

Then install the packages listed in requirements.txt from the ComfyUI-EdgeTTS folder:

```bash
./ComfyUI/python_embeded/python -m pip install -r requirements.txt
```

Requirements

Python packages (see requirements.txt)

FFmpeg on system PATH (required by Whisper STT)

Example (Windows PowerShell): `$env:Path += ";F:\\FFmpeg\\bin"`, then restart ComfyUI

CUDA compatible GPU (optional, for faster Whisper processing)
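To confirm the FFmpeg requirement before starting a workflow, a check along these lines may help. It is a minimal sketch using only the standard library; the function name `find_ffmpeg` is hypothetical and not part of this repository.

```python
import shutil
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable on PATH, or None."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path:
        print(f"FFmpeg found at {path}")
    else:
        print("FFmpeg not found on PATH; the Whisper STT node requires it.")
```

Run it with the same Python interpreter and from the same shell that launches ComfyUI, since PATH changes made in one session are not visible in another.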

[!NOTE]

### Important: torchaudio 2.9+ requires torchcodec

If you use torch/torchaudio 2.9 or newer, torchaudio.load/save requires torchcodec.

Recommended fix (match PyTorch 2.9.x):

./ComfyUI/python_embeded/python -m pip uninstall -y torchcodec
./ComfyUI/python_embeded/python -m pip install --no-cache-dir "torchcodec==0.9"
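Whether the note above applies to a given environment can be checked with a small script. The sketch below reads installed package versions through `importlib.metadata`; the helper names (`installed_version`, `major_minor`, `torchcodec_required`) are hypothetical, and the 2.9 threshold is taken from the note.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as '2.9.1+cu128'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def torchcodec_required(torchaudio_version: Optional[str]) -> bool:
    """True when torchaudio is installed at version 2.9 or newer."""
    return torchaudio_version is not None and major_minor(torchaudio_version) >= (2, 9)


if __name__ == "__main__":
    ta = installed_version("torchaudio")
    tc = installed_version("torchcodec")
    print(f"torchaudio: {ta}, torchcodec: {tc}")
    if torchcodec_required(ta) and tc is None:
        print("torchaudio 2.9+ detected without torchcodec; install torchcodec as described above.")
```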

Text to Speech

Add Edge TTS node to workflow

Input text and select voice

Adjust speed and pitch if needed

Connect to Save Audio node for export
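Outside ComfyUI, the same steps can be approximated directly with the `edge-tts` package from requirements.txt. This is a sketch, not the node's internal code: `format_rate`, `format_pitch`, `synthesize`, and `main` are hypothetical names, and the voice ID `en-US-JennyNeural` and output file `hello.mp3` are example values. It assumes the `edge_tts.Communicate` class accepts `rate` and `pitch` as signed strings such as `+10%` and `+5Hz`.

```python
import asyncio


def format_rate(percent: int) -> str:
    """Build a signed percentage string for the speech rate, e.g. '+10%'."""
    return f"{percent:+d}%"


def format_pitch(hz: int) -> str:
    """Build a signed pitch offset string in hertz, e.g. '-5Hz'."""
    return f"{hz:+d}Hz"


async def synthesize(text: str, voice: str, out_path: str,
                     rate_percent: int = 0, pitch_hz: int = 0) -> None:
    """Synthesize text with Edge TTS and write the audio to out_path."""
    # Imported lazily so the formatting helpers work without the package.
    import edge_tts

    communicate = edge_tts.Communicate(
        text,
        voice,
        rate=format_rate(rate_percent),
        pitch=format_pitch(pitch_hz),
    )
    await communicate.save(out_path)


def main() -> None:
    # Needs network access; writes an MP3 file to the working directory.
    asyncio.run(synthesize("Hello from ComfyUI.", "en-US-JennyNeural",
                           "hello.mp3", rate_percent=10))
```

Calling `main()` performs the request. Inside a workflow, the Edge TTS node's speed and pitch inputs play the role of `rate_percent` and `pitch_hz` here.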

Speech to Text

Add Whisper STT node

Connect audio input

Select model size and language (or auto-detect)

Run to get transcription
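For reference, the steps above correspond roughly to the following use of the `openai-whisper` package. It is a sketch under assumptions: `top_language` and `transcribe` are hypothetical names, detection here runs on the first 30 seconds of audio, and the node may handle model loading and ComfyUI audio input differently.

```python
from typing import Dict, Tuple


def top_language(probs: Dict[str, float]) -> Tuple[str, float]:
    """Return the most likely language code and its probability."""
    language = max(probs, key=probs.get)
    return language, probs[language]


def transcribe(audio_path: str, model_size: str = "tiny") -> Tuple[str, str, float]:
    """Detect the spoken language, then transcribe the file.

    Returns (text, language_code, detection_confidence).
    """
    # Imported lazily; requires openai-whisper and FFmpeg on PATH.
    import whisper

    model = whisper.load_model(model_size)

    # Language detection uses a 30-second window of the audio.
    audio = whisper.pad_or_trim(whisper.load_audio(audio_path))
    mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(model.device)
    _, probs = model.detect_language(mel)
    language, confidence = top_language(probs)

    result = model.transcribe(audio_path, language=language)
    return result["text"].strip(), language, confidence
```

A call such as `transcribe("speech.wav", "base")` downloads the model on first use. Larger model sizes trade speed for accuracy.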

Supported Voices

| Language | Female Voices | Male Voices |
|----------|---------------|-------------|
| Chinese-Mainland | XiaoXiao (News, Novel, Warm), XiaoYi (Cartoon, Novel, Lively) | Yunjian (Sports, Novel, Passion), Yunxi (Novel, Lively), Yunxia (Cartoon, Novel), Yunyang (News, Professional) |
| Chinese-Cantonese | HiuGaai (Friendly), HiuMaan (Friendly) | WanLung (Friendly) |
| Chinese-Taiwan | HsiaoChen (Friendly), HsiaoYu (Friendly) | YunJhe (Friendly) |
| English-US | Jenny (Friendly), Aria (Positive), Ana (Cute), Michelle (Friendly) | Guy (Passion), Christopher (Authority), Eric (Rational), Roger (Lively), Steffan (Rational) |
| English-GB | Libby (Friendly), Maisie (Friendly), Sonia (Friendly) | Ryan (Friendly), Thomas (Friendly) |
| English-AU | Natasha (Friendly) | William (Friendly) |
| Japanese | Nanami (Friendly) | Keita (Friendly) |
| Korean | SunHi (Friendly) | InJoon (Friendly), Hyunsu (Multilingual) |
| French-FR | Denise (Friendly), Eloise (Friendly), Vivienne (Multilingual) | Henri (Friendly), Remy (Multilingual) |
| French-CA | Sylvie (Friendly) | Jean (Friendly), Antoine (Friendly) |
| German-DE | Katja (Friendly), Amala (Friendly), Seraphina (Multilingual) | Conrad (Friendly), Killian (Friendly), Florian (Multilingual) |

More voices available in config.json, including voices for:

German (AT/CH)

Spanish (ES/MX)

Russian

Italian

Portuguese (BR/PT)

Dutch

Polish

Turkish

Arabic

Hindi

Indonesian

Vietnamese

Thai

Ukrainian

And many more…

Each language provides at least one male and female voice option, allowing you to choose different voice styles based on your needs.
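To see which voices the Edge TTS service currently offers, grouped by locale and gender in the same way as the table above, a sketch such as the following can be used. It assumes `edge_tts.list_voices()` returns a list of dictionaries with `Locale`, `Gender`, and `ShortName` keys; `group_voices` and `fetch_grouped_voices` are hypothetical names, and the live list may differ from what the node's config.json exposes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def group_voices(voices: Iterable[Mapping[str, str]]) -> Dict[str, Dict[str, List[str]]]:
    """Group voice short names by locale, then by gender."""
    grouped = defaultdict(lambda: defaultdict(list))
    for voice in voices:
        grouped[voice["Locale"]][voice["Gender"]].append(voice["ShortName"])
    # Convert nested defaultdicts to plain dicts for predictable printing.
    return {locale: dict(genders) for locale, genders in grouped.items()}


async def fetch_grouped_voices() -> Dict[str, Dict[str, List[str]]]:
    """Query the Edge TTS service (needs network access) and group the result."""
    # Imported lazily so group_voices works without the package.
    import edge_tts

    return group_voices(await edge_tts.list_voices())
```

With `asyncio.run(fetch_grouped_voices())`, each locale maps to its female and male voice IDs, which are the identifiers Edge TTS expects when selecting a voice.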

Credits

Edge TTS: Microsoft Edge TTS

Whisper: OpenAI Whisper
